Conversation
* Install zappa and initial example configuration * add temporary IaC code for testing * Update code * Remove IAM and let Zappa manage it * add README * add RDS proxy, pre-commit hooks, clean-up code * feat: add ecs tasks * refactor/clean cache module * refactor/clean database module * refactor/clean ecs module * refactor/clean networking module * address Sonar Qube bot suggestions * keep some Sonar Qube bot suggestions but add #NOSONAR * add terraform dictionary * refactor/clean security module * fix pre-commit hooks and add terraform_validate * add SHA hash and remove terraform_validate * refactor/clean storage module * Update docs * add S3 support for load-data task * Update README and necessary examples * coderabbit suggestions * Update code: install awscli before invoking aws in task * update README * NOSONAR * update README * fix spell check * Update README * fix: typo * remove immutable tagged images * fix: load-data task fail due to no permissions * fix spell check: add PYTHONPATH * add AWS CLI * add set -e * Update code --------- Co-authored-by: Arkadii Yakovets <arkadii.yakovets@owasp.org>
…2551) * Use AWS SSM Parameter Store to handle environment variables * Use focused policy for read access * Update documentation * Add flag for create_rds_proxy * set default value of create_rds_proxy to false * Populate Zappa/Lambda environment variables from ssm/parameter store * Update documentation * Update example * add default configurations * add security group db from lambda
* Use AWS SSM Parameter Store to handle environment variables * Use focused policy for read access * Update documentation * Add flag for create_rds_proxy * set default value of create_rds_proxy to false * Populate Zappa/Lambda environment variables from ssm/parameter store * Update documentation * Update example * add default configurations * add security group db from lambda * fix load-data task by adding a --fixture-path flag * fix ecs tasks by introducing ecs-* make targets * change ecs run steps * remove ecs-* and clean code * add --no-cache * use call_command * add test for --fixture-path * Update code * Update backend/wsgi.py --------- Co-authored-by: Arkadii Yakovets <arkadii.yakovets@owasp.org> Co-authored-by: Arkadii Yakovets <2201626+arkid15r@users.noreply.github.com>
* add remote state management * refactor staging into its own directory * change steps to account for staging * bot suggestions * sort alphabetically * Update code * Update documentation * update formatting * update code
* refactor s3 buckets into a module * add vpc flow logs, cache logs, and update ecs logs * update code
* add `random_id` to bucket name * Update documentation * change default force_destroy_bucket * make bucket names configurable and prevent_destroy * remove specifies * remove `force_destroy` flag * add prevent_destroy * add note * update docs * update docs
* delete .terraform.lock.hcl files * use stricter versioning * fix: rule cycling * sort alphabetically * add tflint config * make README.md formatting consistent * make descriptions consistent * add tags * add periods * update description and add validation * tighten egress * remove secret outputs and replace with ARNs
* allow more origins * add frontend and alb * update readme * fix spell check * sort alphabetically * fix issues * redirect http to https
…-migration-sync-main Sync feature/zappa-migration with main
* Add conditional creation to NAT and VPC endpoints * Add conditional FARGATE SPOT to ECS * Update security module * Add conditional FARGATE SPOT to frontend * Update staging configuration * Simplify frontend deployment steps * Create VPC endpoints conditionally * Simplify README * Refactor alb to be a module * Add backend to ALB * split routes * Update routes and remove code * fix spell check * Fix domain example * Update example * Update code * chamge assert to precondition * drop invalid headers * never use Fargate SPOT for migrate task * Delete frontend alb module * remove create_nat_gateway flag * rename use_fargate_spot to ecs_use_fargate_spot * never use fargate spot for index and load data tasks * Update code * fix indentation
…-migration-sync-main Sync feature/nest-zappa-migration with main
* improve cold starts * add xray and snap_start * Update zappa * Add aliasing to Lambda function version * Add zip callback to cleanup zappa package * add alias update zappa callback * add alias to cleanup older lambda function versions * Update readme * add NOSONAR * Update code * Update code * Update code --------- Co-authored-by: Arkadii Yakovets <arkadii.yakovets@owasp.org>
…-migration-sync-main Sync feature/nest-zappa-migration with main
* Add CI/CD to run terraform plan * Add CI/CD to run terraform apply and zappa deploy * Update zappa settings * update code * cleanup old staging setup * update terraform outputs and variables and use terraform output as source of ids * append _NAME to environment secrets
…-migration-sync-main Sync feature/nest-zappa-migration with main
Summary by CodeRabbit
Walkthrough: Removed Ansible/Compose staging deploy; added comprehensive Terraform staging infrastructure and tests; updated CI to run Terraform/ECR/ECS flows; added Zappa/SSM integration and helpers; introduced backend fixture loader and tests; minor backend, Docker, Makefile, pre-commit and spelling/tooling updates.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
🚥 Pre-merge checks: ✅ 5 passed, ❌ 1 failed (1 warning)
Codecov Report: ✅ All modified and coverable lines are covered by tests.

Additional details and impacted files:

@@ Coverage Diff @@
## main #3827 +/- ##
=======================================
Coverage 93.31% 93.32%
=======================================
Files 512 513 +1
Lines 15816 15827 +11
Branches 2178 2178
=======================================
+ Hits 14759 14770 +11
Misses 695 695
Partials 362 362
Flags with carried forward coverage won't be shown.
6 issues found across 113 files
Confidence score: 3/5
- Risk is moderate because `backend/wsgi.py` lacks error handling around AWS SSM calls, so startup can fail on permissions/network/rate-limit errors, and `backend/zappa_callback.py` does not paginate `list_versions_by_function`, which can miss versions beyond 50 and undermine cleanup behavior in larger deployments.
- Several infrastructure issues (manual ARN reconstruction, missing KMS key_id, README path confusion, and inconsistent `actions/setup-python` pin) add configuration drift and ops friction but are less likely to break runtime behavior immediately.
- Pay close attention to `backend/wsgi.py`, `backend/zappa_callback.py`, `infrastructure/modules/cache/main.tf`, `infrastructure/modules/storage/main.tf`, `infrastructure/README.md`, and `.github/workflows/run-ci-cd.yaml`: startup resilience, Lambda cleanup completeness, and config consistency.
Note: This PR contains a large number of files. cubic only reviews up to 75 files per PR, so some files may not have been reviewed.
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="backend/wsgi.py">
<violation number="1" location="backend/wsgi.py:19">
P1: Missing error handling for AWS SSM API calls. If any boto3 call fails (permissions, network, rate limiting), the application will fail to start. Consider wrapping the AWS calls in a try/except block with appropriate logging.</violation>
</file>
<file name="infrastructure/modules/storage/main.tf">
<violation number="1" location="infrastructure/modules/storage/main.tf:36">
P2: Use module output `module.fixtures_bucket.arn` instead of reconstructing the bucket ARN manually. This duplicates the bucket name logic and could lead to mismatches if the module's naming convention ever changes.</violation>
</file>
<file name="infrastructure/README.md">
<violation number="1" location="infrastructure/README.md:169">
P2: Missing directory change instruction before 'Populate ECR Repositories' section. After step 6 (Configure ALB Routing), the user is in `infrastructure/staging/`, but the Docker commands use paths relative to the repository root. Users following the documentation sequentially will encounter command failures.
Consider adding a directory change instruction at the start of this section, e.g., `cd ../../` or `cd` to repository root.</violation>
</file>
<file name="infrastructure/modules/cache/main.tf">
<violation number="1" location="infrastructure/modules/cache/main.tf:88">
P2: The SSM parameter storing the Redis password should use the same custom KMS key specified for other resources in this module. Without `key_id`, it uses AWS's default managed key, which is inconsistent with the encryption strategy applied to CloudWatch logs and may not meet your key management policies.
Consider adding: `key_id = var.kms_key_arn`</violation>
</file>
<file name="backend/zappa_callback.py">
<violation number="1" location="backend/zappa_callback.py:56">
P2: Missing pagination when listing Lambda versions. AWS `list_versions_by_function` returns max 50 versions per call. With more than 50 versions, this function won't see all versions and may fail to clean up old ones properly. Use a paginator:
```python
paginator = client.get_paginator('list_versions_by_function')
versions = []
for page in paginator.paginate(FunctionName=zappa.lambda_name):
versions.extend(page['Versions'])
```</violation>
</file>
<file name=".github/workflows/run-ci-cd.yaml">
<violation number="1" location=".github/workflows/run-ci-cd.yaml:847">
P2: Inconsistent `actions/setup-python` version. This workflow uses `@a309ff8b426b58ec0e2a45f0f869d46889d02405` in the pre-commit job but `@83679a892e2d95755f2dac6acb0bfd1e9ac5d548` here. Using different versions of the same action can lead to subtle inconsistencies and makes maintenance harder. Consider using the same commit SHA across all instances.</violation>
</file>
Since this is your first cubic review, here's how it works:
- cubic automatically reviews your code and comments on bugs and improvements
- Teach cubic by replying to its comments. cubic learns from your replies and gets better over time
- Ask questions if you need clarification on any suggestion
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
| client = boto3.client("ssm") | ||
| paginator = client.get_paginator("get_parameters_by_path") | ||
| response_iterator = paginator.paginate(Path=ssm_param_path, WithDecryption=True) | ||
|
|
||
| for page in response_iterator: | ||
| for param in page["Parameters"]: | ||
| os.environ[Path(param["Name"]).name] = param["Value"] |
P1: Missing error handling for AWS SSM API calls. If any boto3 call fails (permissions, network, rate limiting), the application will fail to start. Consider wrapping the AWS calls in a try/except block with appropriate logging.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At backend/wsgi.py, line 19:
<comment>Missing error handling for AWS SSM API calls. If any boto3 call fails (permissions, network, rate limiting), the application will fail to start. Consider wrapping the AWS calls in a try/except block with appropriate logging.</comment>
<file context>
@@ -1,10 +1,35 @@
+ if not (ssm_param_path := os.getenv("AWS_SYSTEMS_MANAGER_PARAM_STORE_PATH")):
+ return
+
+ client = boto3.client("ssm")
+ paginator = client.get_paginator("get_parameters_by_path")
+ response_iterator = paginator.paginate(Path=ssm_param_path, WithDecryption=True)
</file context>
Current:

    client = boto3.client("ssm")
    paginator = client.get_paginator("get_parameters_by_path")
    response_iterator = paginator.paginate(Path=ssm_param_path, WithDecryption=True)
    for page in response_iterator:
        for param in page["Parameters"]:
            os.environ[Path(param["Name"]).name] = param["Value"]

Suggested:

    try:
        client = boto3.client("ssm")
        paginator = client.get_paginator("get_parameters_by_path")
        response_iterator = paginator.paginate(Path=ssm_param_path, WithDecryption=True)
        for page in response_iterator:
            for param in page["Parameters"]:
                os.environ[Path(param["Name"]).name] = param["Value"]
    except Exception as exc:
        import logging
        logging.getLogger(__name__).warning(
            "Failed to populate environment from SSM: %s", exc
        )
    Statement = [{
      Effect   = "Allow"
      Action   = ["s3:GetObject"]
      Resource = ["arn:aws:s3:::${var.fixtures_bucket_name}-${random_id.suffix.hex}/*"]
P2: Use module output module.fixtures_bucket.arn instead of reconstructing the bucket ARN manually. This duplicates the bucket name logic and could lead to mismatches if the module's naming convention ever changes.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At infrastructure/modules/storage/main.tf, line 36:
<comment>Use module output `module.fixtures_bucket.arn` instead of reconstructing the bucket ARN manually. This duplicates the bucket name logic and could lead to mismatches if the module's naming convention ever changes.</comment>
<file context>
@@ -0,0 +1,49 @@
+ Statement = [{
+ Effect = "Allow"
+ Action = ["s3:GetObject"]
+ Resource = ["arn:aws:s3:::${var.fixtures_bucket_name}-${random_id.suffix.hex}/*"]
+ }]
+ })
</file context>
| terraform apply | ||
| ``` | ||
|
|
||
| ## Populate ECR Repositories |
P2: Missing directory change instruction before 'Populate ECR Repositories' section. After step 6 (Configure ALB Routing), the user is in infrastructure/staging/, but the Docker commands use paths relative to the repository root. Users following the documentation sequentially will encounter command failures.
Consider adding a directory change instruction at the start of this section, e.g., cd ../../ or cd to repository root.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At infrastructure/README.md, line 169:
<comment>Missing directory change instruction before 'Populate ECR Repositories' section. After step 6 (Configure ALB Routing), the user is in `infrastructure/staging/`, but the Docker commands use paths relative to the repository root. Users following the documentation sequentially will encounter command failures.
Consider adding a directory change instruction at the start of this section, e.g., `cd ../../` or `cd` to repository root.</comment>
<file context>
@@ -0,0 +1,339 @@
+ terraform apply
+ ```
+
+## Populate ECR Repositories
+ECR Repositories are used to store images used by ECS (Frontend + Backend Tasks)
+
</file context>
    })
    }

    resource "aws_ssm_parameter" "django_redis_password" {
P2: The SSM parameter storing the Redis password should use the same custom KMS key specified for other resources in this module. Without key_id, it uses AWS's default managed key, which is inconsistent with the encryption strategy applied to CloudWatch logs and may not meet your key management policies.
Consider adding: key_id = var.kms_key_arn
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At infrastructure/modules/cache/main.tf, line 88:
<comment>The SSM parameter storing the Redis password should use the same custom KMS key specified for other resources in this module. Without `key_id`, it uses AWS's default managed key, which is inconsistent with the encryption strategy applied to CloudWatch logs and may not meet your key management policies.
Consider adding: `key_id = var.kms_key_arn`</comment>
<file context>
@@ -0,0 +1,94 @@
+ })
+}
+
+resource "aws_ssm_parameter" "django_redis_password" {
+ description = "The password of Redis cache (Required by Django)."
+ name = "/${var.project_name}/${var.environment}/DJANGO_REDIS_PASSWORD"
</file context>
| print("Updating Lambda alias...") | ||
|
|
||
| client = boto3.client("lambda") | ||
| versions = client.list_versions_by_function(FunctionName=zappa.lambda_name)["Versions"] |
P2: Missing pagination when listing Lambda versions. AWS list_versions_by_function returns max 50 versions per call. With more than 50 versions, this function won't see all versions and may fail to clean up old ones properly. Use a paginator:
paginator = client.get_paginator('list_versions_by_function')
versions = []
for page in paginator.paginate(FunctionName=zappa.lambda_name):
versions.extend(page['Versions'])Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At backend/zappa_callback.py, line 56:
<comment>Missing pagination when listing Lambda versions. AWS `list_versions_by_function` returns max 50 versions per call. With more than 50 versions, this function won't see all versions and may fail to clean up old ones properly. Use a paginator:
```python
paginator = client.get_paginator('list_versions_by_function')
versions = []
for page in paginator.paginate(FunctionName=zappa.lambda_name):
versions.extend(page['Versions'])
```</comment>
<file context>
@@ -0,0 +1,93 @@
+ print("Updating Lambda alias...")
+
+ client = boto3.client("lambda")
+ versions = client.list_versions_by_function(FunctionName=zappa.lambda_name)["Versions"]
+
+ if not (published := [v["Version"] for v in versions if v["Version"] != "$LATEST"]):
</file context>
    run: pipx install poetry

    - name: Set up Python
      uses: actions/setup-python@83679a892e2d95755f2dac6acb0bfd1e9ac5d548
P2: Inconsistent actions/setup-python version. This workflow uses @a309ff8b426b58ec0e2a45f0f869d46889d02405 in the pre-commit job but @83679a892e2d95755f2dac6acb0bfd1e9ac5d548 here. Using different versions of the same action can lead to subtle inconsistencies and makes maintenance harder. Consider using the same commit SHA across all instances.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At .github/workflows/run-ci-cd.yaml, line 847:
<comment>Inconsistent `actions/setup-python` version. This workflow uses `@a309ff8b426b58ec0e2a45f0f869d46889d02405` in the pre-commit job but `@83679a892e2d95755f2dac6acb0bfd1e9ac5d548` here. Using different versions of the same action can lead to subtle inconsistencies and makes maintenance harder. Consider using the same commit SHA across all instances.</comment>
<file context>
@@ -644,144 +689,264 @@ jobs:
+ run: pipx install poetry
+
+ - name: Set up Python
+ uses: actions/setup-python@83679a892e2d95755f2dac6acb0bfd1e9ac5d548
+ with:
+ cache: 'poetry'
</file context>
    - uses: actions/setup-python@83679a892e2d95755f2dac6acb0bfd1e9ac5d548
    + uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405
Actionable comments posted: 17
🤖 Fix all issues with AI agents
In @.github/workflows/run-ci-cd.yaml:
- Around line 755-761: The "Show plan in summary" step currently writes the raw
terraform show -no-color tfplan into $GITHUB_STEP_SUMMARY (working-directory:
infrastructure/staging), which may leak sensitive infra details; modify this by
either removing the step entirely, gating it so it only runs for
private/internal runs, or replace the direct dump with a filtered output: run
terraform show -json tfplan and pipe through jq to extract only non-sensitive
summary fields (or redact VPC/subnet/security group IDs) before appending to
$GITHUB_STEP_SUMMARY, or alternatively upload the full tfplan as a protected
artifact instead of writing it to $GITHUB_STEP_SUMMARY.
In `@backend/tests/apps/common/management/commands/load_data_test.py`:
- Around line 19-21: Remove the unused MagicMock instances `mock_model` and
`mock_app_config` (and the `mock_app_config.get_models.return_value =
[mock_model]` line) from the tests `test_handle` and
`test_handle_with_custom_fixture_path`; locate the variable assignments to
`mock_model` and `mock_app_config` in those test functions and delete them so
the tests no longer define dead variables.
- Around line 74-94: The test test_handle_with_exception_during_call_command
incorrectly patches contextlib.suppress (making the suppression accidental);
instead remove the patch of contextlib.suppress and replace the with-context
using pytest.raises(Exception): call call_command("load_data") inside
pytest.raises to assert the exception is raised, keep
mock_call_command.side_effect = Exception(...), add import pytest at the top and
remove the now-unused import contextlib, and retain the existing assertions that
mock_unregister.assert_called_once(), mock_register.assert_called_once(), and
mock_atomic.assert_called_once().
In `@backend/zappa_callback.py`:
- Around line 83-85: The deletion loop can fail when an alias (notably any alias
other than "live") points to a version in to_delete; update the code in the
to_delete loop to skip or handle versions that have alias associations by either
(A) calling client.list_aliases(FunctionName=<function name>) / list_aliases
with the version/ARN to detect any aliases pointing to that version and only
call client.delete_function(FunctionName=version["FunctionArn"]) when no aliases
exist, or (B) wrap the client.delete_function call in a try/except (catching
botocore.exceptions.ClientError) and skip/log the version when the error
indicates a resource conflict due to an alias; use the existing to_delete
variable and version["FunctionArn"] identifier and ensure "live" alias is not
assumed to be the only alias.
- Around line 39-43: The regex excludes are being tested against the full
filesystem path (str(filepath)) instead of the archive-relative path, so update
the exclude check in the loop to compute the relative path once (e.g., rel =
filepath.relative_to(temp_path) and use str(rel) or rel.as_posix()) and run the
any(re.search(p, str(rel)) for p in excludes) against that relative path before
calling tf.add(filepath, rel); keep the existing new_archive_path comparison
as-is.
- Around line 55-60: The current single call to
boto3.client("lambda").list_versions_by_function may miss versions due to AWS
pagination; replace it with a paginator
(client.get_paginator("list_versions_by_function")) to iterate all pages and
accumulate Versions before computing the published list (the code that builds
published := [v["Version"] ...] used by update_alias) and apply the same
paginated collection pattern inside cleanup_versions so both update_alias and
cleanup_versions operate on the full set of versions rather than only the first
page.
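For illustration only, the two `backend/zappa_callback.py` items above (full pagination and alias-safe deletion) could be sketched roughly as below. The helper names and the exact conflict error code are assumptions, not code from this PR:

```python
import botocore.exceptions


def list_all_versions(client, function_name: str) -> list[dict]:
    """Return every version of a Lambda function, following pagination."""
    paginator = client.get_paginator("list_versions_by_function")
    versions: list[dict] = []
    for page in paginator.paginate(FunctionName=function_name):
        versions.extend(page["Versions"])
    return versions


def delete_versions_safely(client, to_delete: list[dict]) -> None:
    """Delete old versions, skipping any that an alias still references."""
    for version in to_delete:
        try:
            client.delete_function(FunctionName=version["FunctionArn"])
        except botocore.exceptions.ClientError as error:
            # Assumed error code for "an alias still points at this version".
            if error.response["Error"]["Code"] == "ResourceConflictException":
                print(f"Skipping version {version['Version']}: referenced by an alias")
                continue
            raise
```

Both `update_alias` and `cleanup_versions` could then share the paginated list (e.g. `list_all_versions(boto3.client("lambda"), zappa.lambda_name)`) instead of relying on a single `list_versions_by_function` call.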
In `@infrastructure/Makefile`:
- Around line 1-12: The find invocation in the test-infrastructure Makefile
target prints entries before filtering for directories, so move the -type d
predicate before -print0 to ensure only directories named "tests" are returned;
also add a .PHONY: test-infrastructure declaration so the Makefile target is
always executed rather than treated as a file. Update the test-infrastructure
target (and add a .PHONY entry) accordingly.
In `@infrastructure/modules/cache/main.tf`:
- Around line 88-94: The aws_ssm_parameter resource "django_redis_password" is
using SecureString but not specifying a KMS CMK; update the resource to include
the KMS key by setting the key_id attribute to the module variable used
elsewhere (var.kms_key_arn) so the parameter is encrypted with the same CMK as
other resources (ensure var.kms_key_arn is available in this module and used in
aws_ssm_parameter "django_redis_password" alongside value =
local.redis_auth_token).
In `@infrastructure/modules/database/main.tf`:
- Around line 81-87: The aws_ssm_parameter resource "django_db_password" is
creating a SecureString but not specifying a KMS key; change the resource to use
the customer-managed KMS key by setting key_id = var.kms_key_arn so it uses the
same key as the module's Secrets Manager usage (reference
aws_ssm_parameter.django_db_password and var.kms_key_arn when updating the
resource).
- Around line 37-61: The aws_db_instance.main resource sets storage_encrypted =
true but doesn't supply a CMK: update aws_db_instance.main to reference the
supplied KMS key (var.kms_key_arn or a derived kms_key_id) by setting kms_key_id
= var.kms_key_arn so RDS uses the customer-managed key; additionally enable
Performance Insights for production by setting performance_insights_enabled =
true and performance_insights_kms_key_id = var.kms_key_arn (or a dedicated ARN
variable) so Performance Insights is encrypted with the same CMK; ensure names
match the resource aws_db_instance.main and variables var.kms_key_arn (or add a
new var if needed) and validate the ARN vs key ID format expected by RDS.
In `@infrastructure/modules/ecs/main.tf`:
- Around line 84-100: The IAM policy resource
aws_iam_policy.ecs_tasks_execution_role_ssm_policy currently only allows
"ssm:GetParameters"; update its policy document (the Action array in
ecs_tasks_execution_role_ssm_policy) to include "ssm:GetParameter" as well so
tasks can call the singular GetParameter API (matching the frontend policy);
ensure the jsonencode block lists both actions and keep tags/metadata unchanged.
In `@infrastructure/modules/parameters/main.tf`:
- Around line 129-135: The aws_ssm_parameter resources for the generated secrets
(e.g., aws_ssm_parameter.django_secret_key and
aws_ssm_parameter.nextauth_secret) currently track the value from
random_string/random_password and will be overwritten if those resources are
recreated; add a lifecycle block to each SSM parameter with ignore_changes =
[value] so Terraform will not update the stored secret after initial creation
(allowing deliberate rotation instead). Locate the aws_ssm_parameter resource
blocks for django_secret_key and nextauth_secret and add lifecycle {
ignore_changes = [value] } inside each resource.
- Around line 258-266: Replace the two random_string resources with
random_password resources so the secrets are marked sensitive: change resource
"random_string" "django_secret_key" and resource "random_string"
"nextauth_secret" to resource "random_password" "django_secret_key" and resource
"random_password" "nextauth_secret" (keeping length and special attributes), and
update any references elsewhere that use random_string.django_secret_key or
random_string.nextauth_secret to random_password.django_secret_key and
random_password.nextauth_secret so the .result values remain the same but are
treated as sensitive.
In `@infrastructure/modules/security/main.tf`:
- Around line 77-86: Add an egress security group rule allowing HTTPS from the
RDS Proxy security group to the VPC endpoint SG: create a new resource named
aws_security_group_rule.rds_proxy_to_vpc_endpoints that is conditional on
var.create_rds_proxy && var.create_vpc_endpoint_rules, with description "Allow
HTTPS to VPC endpoints", type "egress", protocol "tcp", from_port/to_port 443,
security_group_id set to aws_security_group.rds_proxy[0].id and
source_security_group_id set to var.vpc_endpoint_sg_id so the RDS Proxy can
reach Secrets Manager via the VPC endpoint.
In `@infrastructure/modules/storage/main.tf`:
- Around line 28-40: The IAM policy resource aws_iam_policy.fixtures_read_only
is reconstructing the bucket ARN using
${var.fixtures_bucket_name}-${random_id.suffix.hex} which duplicates naming
logic and can drift; replace the hardcoded Resource value with the S3 module's
ARN output (e.g., use module.<s3_module_name>.arn or module.fixtures_bucket.arn)
so the policy references the canonical bucket ARN output from the s3-bucket
submodule instead of rebuilding it from var.fixtures_bucket_name and
random_id.suffix.hex.
In `@infrastructure/README.md`:
- Around line 41-46: Two adjacent blockquote notes ("Copy the state bucket name
from the output." and "It is recommended to not destroy the backend resources
unless absolutely necessary.") are separated by a blank line which triggers
MD028; fix by either merging the consecutive blockquotes into a single
blockquote block (combine their lines so they appear in one contiguous >
paragraph) or add a comment directive <!-- markdownlint-disable MD028 --> above
the first blockquote and re-enable it after if preferred; update the README.md
sections containing those quoted lines accordingly.
🧹 Nitpick comments (38)
backend/apps/common/management/commands/load_data.py (1)
28-32: `_options` is accessed — drop the underscore prefix. The leading underscore conventionally signals an unused parameter, but `_options["fixture_path"]` is referenced on Line 32. This is misleading to readers and linters.

Proposed fix:

    - def handle(self, *_args, **_options) -> None:
    + def handle(self, *_args, **options) -> None:
          """Load data into the OWASP Nest application."""
    -     with index.disable_indexing(), transaction.atomic():
    -         # Run loaddata
    -         call_command("loaddata", _options["fixture_path"], "-v", "3")
    +     with index.disable_indexing(), transaction.atomic():
    +         call_command("loaddata", options["fixture_path"], "-v", "3")

backend/tests/apps/common/management/commands/load_data_test.py (1)
23-24: Consider using `MagicMock(spec=contextlib.AbstractContextManager)` for `mock_atomic`. Manually wiring `__enter__` and `__exit__` on the mock is verbose. Using `MagicMock()` as a context manager already works in modern `unittest.mock` — `MagicMock` supports the context-manager protocol out of the box, so these lines can simply be removed.

Also applies to: 53-54
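As a quick illustration (not taken from the test file), a bare `MagicMock` already behaves as a context manager, so the manual `__enter__`/`__exit__` wiring can simply be dropped:

```python
from unittest.mock import MagicMock

mock_atomic = MagicMock()

with mock_atomic():  # __enter__/__exit__ are auto-generated child mocks
    pass

mock_atomic.return_value.__exit__.assert_called_once()
```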
infrastructure/staging/terraform.tfbackend.example (1)
1-3: Consider addingkeyandencryptfields to the backend example.Without a
keyfield, Terraform defaults the state path toterraform.tfstateat the bucket root, which will collide if multiple environments or stacks share the same bucket. Addingencrypt = trueis also standard practice for S3 backends.Suggested additions
bucket = "${STATE_BUCKET_NAME}" dynamodb_table = "nest-terraform-state-lock" +encrypt = true +key = "staging/terraform.tfstate" region = "us-east-2"docker/backend/Dockerfile (1)
52-54: Minor:apk updatewrites an index cache that--no-cacheonaddwon't clean up.Consider either using
--no-cacheconsistently (which skips the index fetch) or adding&& rm -rf /var/cache/apk/*at the end. The current form works but leaves a few hundred KB of stale index data in the final image.Suggested fix
-RUN apk update && apk --no-cache add make && \ +RUN apk --no-cache add make && \ addgroup -S -g ${OWASP_GID} owasp && \ adduser -S -h /home/owasp -u ${OWASP_UID} -G owasp owaspdocker-compose/local/compose.yaml (1)
144-150: Consider removing the-zappasuffix before merging to main.Per learnings, the
-zappasuffix was introduced on the feature branch to prevent volume clashes during parallel development. Once this merges tomain, the isolation rationale no longer applies, and the suffix becomes a permanent naming oddity. Reverting to the original volume names (e.g.,backend-venv,cache-data,db-data) would keep naming clean on the main branch.Note: developers will need to recreate volumes either way after the merge, so there's no data-loss concern specific to renaming.
backend/wsgi.py (1)
14-25: `populate_environ_from_ssm` — no error handling around SSM calls. If SSM is unreachable or the IAM role lacks permissions, the unhandled `ClientError`/`BotoCoreError` will crash the Lambda cold start with no clear diagnostic message. For Lambda, fail-fast is reasonable, but a targeted `except` with a clear log message would aid debugging.

This is a minor observation since fail-fast is a valid strategy here — just noting it could improve operational debuggability.
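A rough sketch of that targeted variant, assuming the function shape shown in the diff above; the logger name and the choice to re-raise are illustrative assumptions, not the author's implementation:

```python
import logging
import os
from pathlib import Path

import boto3
from botocore.exceptions import BotoCoreError, ClientError

logger = logging.getLogger("wsgi")


def populate_environ_from_ssm() -> None:
    """Copy SSM parameters under the configured path into os.environ."""
    if not (ssm_param_path := os.getenv("AWS_SYSTEMS_MANAGER_PARAM_STORE_PATH")):
        return

    try:
        paginator = boto3.client("ssm").get_paginator("get_parameters_by_path")
        for page in paginator.paginate(Path=ssm_param_path, WithDecryption=True):
            for param in page["Parameters"]:
                os.environ[Path(param["Name"]).name] = param["Value"]
    except (BotoCoreError, ClientError):
        # Still fail fast, but leave an explicit trace of why the cold start died.
        logger.exception("Could not load parameters from SSM path %s", ssm_param_path)
        raise
```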
infrastructure/README.md (1)
97-100: Step 3 numbering: "Populate Secrets" could clarify which secrets need manual entry.The instruction to populate
DJANGO_*secrets withto-be-set-in-aws-consolevalues is helpful. Consider listing the specific parameter names (or referencing the Terraform module that creates them) so operators don't have to hunt through the Parameter Store for which ones need attention.infrastructure/modules/storage/modules/s3-bucket/main.tf (1)
20-41: Minor: Use"*"instead of{ AWS = "*" }for thePrincipalin deny policies.AWS documentation recommends using
Principal = "*"(a plain string) rather than{ AWS = "*" }for bucket policies that deny access. While functionally equivalent in deny statements, the canonical form is cleaner and avoids potential confusion.Suggested diff
Statement = [{ Sid = "HTTPSOnly" Effect = "Deny" - Principal = { AWS = "*" } + Principal = "*" Action = "s3:*"infrastructure/modules/cache/main.tf (1)
44-50: Unnecessarycount = 1onrandom_password.The
count = 1serves no purpose here since there's no conditional — it always creates exactly one resource. Removing it simplifies the code and eliminates the need for[0]indexing inlocal.redis_auth_token.Suggested diff
resource "random_password" "redis_auth_token" { - count = 1 length = 32 # Redis auth token has specific requirements for special characters. override_special = "!$^-" special = true }Then update the local:
locals { parameter_group_name = "default.redis${local.redis_major_version}" - redis_auth_token = random_password.redis_auth_token[0].result + redis_auth_token = random_password.redis_auth_token.result redis_major_version = split(".", var.redis_engine_version)[0] }infrastructure/modules/storage/modules/s3-bucket/outputs.tf (1)
6-9: Consider exposing specific attributes instead of the entire resource object.Outputting the full
aws_s3_bucket.thisresource object couples consumers to the internal resource schema. If consumers only needid,arn,bucket, etc., exposing those individually is more intentional and resilient to provider changes.That said, for an internal module this is acceptable if it's convenient for your use case.
infrastructure/modules/networking/modules/nacl/variables.tf (1)
1-4: Inconsistent:common_tagslacks a default here but hasdefault = {}in sibling modules.Other modules (e.g.,
security/variables.tf,s3-bucket/variables.tf) definedefault = {}forcommon_tags. Missing the default here means callers must always explicitly pass it, even if empty. Consider addingdefault = {}for consistency, or deliberately omit it if tags are required for this module.infrastructure/modules/ecs/variables.tf (1)
44-48: Consider whether"latest"is a safe default forimage_tag.Defaulting to
"latest"can cause non-deterministic deployments — Terraform won't detect image changes since the tag string doesn't change, and rollbacks become harder. Consider requiring this variable explicitly (no default) to force callers to pin a specific tag.Proposed change
variable "image_tag" { description = "The Docker image tag to use for ECS tasks." type = string - default = "latest" }infrastructure/modules/database/main.tf (1)
29-34: Some special characters inoverride_specialmay cause issues in connection strings.Characters like
#,%, and&can break database connection URIs if not properly URL-encoded. Consider limiting to safer special characters.Proposed change
resource "random_password" "db_password" { count = local.generate_db_password ? 1 : 0 length = 32 - # Avoid special characters that might cause issues - override_special = "!#$%&*()-_=+[]{}<>:?" + # Use only URL-safe special characters to avoid connection string issues + override_special = "!_-" special = true }infrastructure/modules/networking/.terraform.lock.hcl (1)
1-25: Lockfile is consistent with the project's exact-pinning convention.The latest AWS provider releases go up to 6.31.0, and there's also a 6.22.1 patch release. While the team deliberately uses exact version pinning for reproducibility, consider planning a coordinated version bump across all modules when convenient — you're 9 minor versions behind. Based on learnings, the team prefers exact version pinning for maximum reproducibility.
infrastructure/modules/kms/tests/kms.tftest.hcl (2)
3-8: Policy tests validate mock data, not actual module logic.The
override_datablock on Line 3-8 replacesdata.aws_iam_policy_document.key_policywith a hardcoded JSON string. Thetest_cloudwatch_logs_policy_statement_existsandtest_iam_root_policy_statement_existsruns then regex-match against this same hardcoded string. These tests will always pass regardless of what policy statements the module actually constructs, so they don't validate the real KMS key policy logic.Consider removing the
override_datablock or, if it's needed for plan-mode execution, acknowledging in comments that these two tests are smoke-checking the test harness rather than the policy content.Also applies to: 25-32, 56-63
74-81: Replacecustomer_master_key_specwithkey_spec.The
customer_master_key_specattribute is deprecated in the AWS API and Terraform provider; usekey_specinstead. Both behave identically and accept the same values, but AWS recommends transitioning tokey_specgoing forward.Suggested fix
assert { - condition = aws_kms_key.main.customer_master_key_spec == "SYMMETRIC_DEFAULT" + condition = aws_kms_key.main.key_spec == "SYMMETRIC_DEFAULT" error_message = "KMS key must use SYMMETRIC_DEFAULT key spec." }infrastructure/modules/parameters/tests/parameters.tftest.hcl (1)
23-421: Consider consolidating path-format and type assertions into singlerunblocks per parameter.Every parameter currently uses two separate
runblocks (one for path format, one for type). Since arunblock supports multipleassertblocks, you could combine each pair into onerun, cutting the number of test runs roughly in half and improving readability without losing any coverage. This mirrors howtest_iam_role_policy_attachments_linkin the frontend test file uses multiple asserts in one run.Example consolidation for one parameter
-run "test_django_algolia_application_id_path_format" { - command = plan - assert { - condition = aws_ssm_parameter.django_algolia_application_id.name == "/${var.project_name}/${var.environment}/DJANGO_ALGOLIA_APPLICATION_ID" - error_message = "DJANGO_ALGOLIA_APPLICATION_ID must follow path: /{project}/{environment}/DJANGO_ALGOLIA_APPLICATION_ID." - } -} - -run "test_django_algolia_application_id_is_secure_string" { - command = plan - assert { - condition = aws_ssm_parameter.django_algolia_application_id.type == "SecureString" - error_message = "DJANGO_ALGOLIA_APPLICATION_ID must be stored as SecureString." - } -} +run "test_django_algolia_application_id" { + command = plan + assert { + condition = aws_ssm_parameter.django_algolia_application_id.name == "/${var.project_name}/${var.environment}/DJANGO_ALGOLIA_APPLICATION_ID" + error_message = "DJANGO_ALGOLIA_APPLICATION_ID must follow path: /{project}/{environment}/DJANGO_ALGOLIA_APPLICATION_ID." + } + assert { + condition = aws_ssm_parameter.django_algolia_application_id.type == "SecureString" + error_message = "DJANGO_ALGOLIA_APPLICATION_ID must be stored as SecureString." + } +}infrastructure/modules/database/outputs.tf (1)
6-9: Description is misleading whencreate_rds_proxyis false.When
var.create_rds_proxyisfalse, this output returns the RDS instance address, not a proxy endpoint. Consider updating the description to reflect the conditional behavior.Suggested fix
output "db_proxy_endpoint" { - description = "The endpoint of the RDS proxy." + description = "The database connection endpoint (RDS Proxy when enabled, otherwise RDS instance address)." value = var.create_rds_proxy ? aws_db_proxy.main[0].endpoint : aws_db_instance.main.address }infrastructure/modules/cache/outputs.tf (1)
1-4: Minor grammar nit in description.Suggested fix
output "redis_password_arn" { - description = "The SSM Parameter ARN of password of Redis." + description = "The SSM Parameter ARN of the Redis password." value = aws_ssm_parameter.django_redis_password.arn }infrastructure/modules/ecs/outputs.tf (1)
11-14: Minor description inconsistency: "ECS tasks cluster" vs "ECS cluster".The frontend module's equivalent output (Line 12 of
frontend/outputs.tf) uses "The name of the ECS cluster." while this says "The name of the ECS tasks cluster." Consider aligning the wording for consistency across modules.Suggested fix
output "ecs_cluster_name" { - description = "The name of the ECS tasks cluster." + description = "The name of the ECS cluster." value = aws_ecs_cluster.main.name }infrastructure/modules/security/main.tf (1)
48-56: Frontend port 3000 is hardcoded in security group rules.The ALB module receives
frontend_portas a variable (set to 3000 in staging), but the security module hardcodes port 3000 in bothalb_to_frontendegress (Line 50/54) andfrontend_from_albingress (Line 99/103). Iffrontend_portever changes, these rules would silently become stale.Consider passing
frontend_portas a variable to the security module to keep these in sync.♻️ Suggested approach
Add a
frontend_portvariable to the security module and use it in both rules:resource "aws_security_group_rule" "alb_to_frontend" { description = "Allow traffic to frontend ECS tasks" - from_port = 3000 + from_port = var.frontend_port protocol = "tcp" security_group_id = aws_security_group.alb.id source_security_group_id = aws_security_group.frontend.id - to_port = 3000 + to_port = var.frontend_port type = "egress" }Apply the same to
frontend_from_alb.Also applies to: 97-105
infrastructure/modules/networking/tests/networking.tftest.hcl (1)
244-263: Good positive and negative testing for VPC endpoint module creation.The default-disabled test ensures no unnecessary resources are created, while the
create_vpc_s3_endpoint = trueoverride validates that enabling any single endpoint flag triggers module creation. Consider adding a test that enables multiple endpoints simultaneously to verify there's still only one module instance.infrastructure/backend/main.tf (1)
195-211: Consider using KMS encryption for the state bucket instead of AES256.The DynamoDB lock table uses the customer-managed KMS key from
module.kms(line 93), but both S3 buckets fall back toAES256(SSE-S3). Since the state bucket holds sensitive Terraform state (potentially containing secrets and resource identifiers), using the same KMS key would provide consistent encryption posture and key management controls.♻️ Proposed fix for state bucket encryption
resource "aws_s3_bucket_server_side_encryption_configuration" "state" { bucket = aws_s3_bucket.state.id rule { apply_server_side_encryption_by_default { - sse_algorithm = "AES256" + sse_algorithm = "aws:kms" + kms_master_key_id = module.kms.key_arn } + bucket_key_enabled = true } }infrastructure/modules/ecs/modules/task/main.tf (1)
37-37: Consider parameterizing the container name.The container name is hardcoded to
"backend", but this task module is reused for multiple task types (sync_data, migrate, index_data, etc.). Usingvar.task_name(or a dedicated variable) would make the module more self-describing and easier to target for container overrides inRunTaskAPI calls.♻️ Suggested change
- name = "backend" + name = var.task_nameinfrastructure/modules/networking/modules/nacl/main.tf (1)
38-46: Private ephemeral inbound from0.0.0.0/0is intentionally broad — confirm this is expected.Rule 100 allows inbound TCP 1024–65535 from
0.0.0.0/0on the private NACL. This is standard for accepting return traffic through a NAT Gateway, but it also admits unsolicited traffic on ephemeral ports from any source that somehow reaches the private subnet. Since NACLs are stateless, this is the typical trade-off. Just flagging for awareness — security groups provide the stateful second layer.infrastructure/modules/alb/outputs.tf (1)
31-35:depends_onon output block — valid but worth a brief inline comment.The
depends_on = [aws_lb_listener.https]ensures the HTTPS listener exists before downstream modules consume the target group ARN. This is a correct ordering safeguard. Consider adding a brief comment explaining why this dependency exists, sincedepends_onon outputs is uncommon and might surprise future maintainers.infrastructure/modules/frontend/variables.tf (1)
51-55:image_tagdefaults to"latest"— acceptable for testing but worth noting.For a testing environment this is fine. If this module is ever promoted to production, pinning to a specific tag would be important to ensure reproducible deployments.
infrastructure/modules/alb/variables.tf (1)
45-49: Minor naming inconsistency:log_retention_daysvslog_retention_in_daysin other modules.The networking module and frontend module use
log_retention_in_days, while this ALB module useslog_retention_days. Consider aligning the naming for consistency across the infrastructure modules.infrastructure/modules/networking/modules/vpc-endpoint/main.tf (1)
37-122: Interface endpoints are consistently structured — well-factored repetitive resources.All five interface endpoints follow the same pattern (private DNS, shared SG, private subnets, consistent tagging). If the number of endpoints grows further, you could consider using
for_eachover a map of service names to reduce repetition, but at five endpoints the current approach is clear and maintainable.infrastructure/modules/frontend/main.tf (1)
109-136: Consider addingdeployment_circuit_breakerto the ECS service.Without a circuit breaker, a bad deployment can leave the service in a degraded state with ECS continuously trying to place failing tasks. Enabling it with rollback provides automatic recovery.
♻️ Suggested addition
resource "aws_ecs_service" "frontend" { cluster = aws_ecs_cluster.frontend.id desired_count = var.desired_count health_check_grace_period_seconds = 60 name = "${var.project_name}-${var.environment}-frontend-service" ... task_definition = aws_ecs_task_definition.frontend.arn + deployment_circuit_breaker { + enable = true + rollback = true + } + capacity_provider_strategy {infrastructure/modules/alb/main.tf (2)
83-101:enable_deletion_protectionis hardcoded tofalse.If this module is intended for reuse across staging and production, consider parameterizing this. Accidental deletion of a production ALB could cause an outage.
♻️ Suggested change
- enable_deletion_protection = false + enable_deletion_protection = var.enable_deletion_protectionAdd to
variables.tf:variable "enable_deletion_protection" { description = "Whether to enable deletion protection on the ALB." type = bool default = false }
134-149: Listener rule priority useseach.keywhich is a string, not an integer.The
for_eachmap uses string keys (fromfor idx, chunk). While Terraform will implicitly convert the string index to a number in100 + each.key, this relies on implicit type coercion. The expression works correctly becauseidxproduces numeric strings ("0", "1", "2"), but it would be clearer to usetonumber(each.key).♻️ Minor clarification
- priority = 100 + each.key + priority = 100 + tonumber(each.key).github/workflows/run-ci-cd.yaml (3)
537-545: Long-lived AWS access keys used instead of OIDC.The workflow already uses OIDC (
id-token: write) for Codecov in other jobs. Using OIDC withaws-actions/configure-aws-credentialsfor AWS access would eliminate the need for long-livedAWS_ACCESS_KEY_ID/AWS_SECRET_ACCESS_KEYsecrets, reducing the blast radius if credentials are compromised. This applies to all three jobs that configure AWS credentials (build-staging-images, plan-staging-nest, deploy-staging-nest).
818-826: Plan artifact may become stale between plan and apply jobs.If infrastructure state changes between the
plan-staging-nestanddeploy-staging-nestjobs (e.g., manual changes, concurrent runs),terraform apply tfplanwill fail because the saved plan's state serial won't match. This is actually the correct behavior — Terraform will refuse to apply a stale plan. Just be aware that this can cause deploy failures requiring a re-run of the pipeline.Also, note that
-auto-approveis redundant when applying a saved plan file.
857-870: Zappa deployment depends on Terraform outputs but has no rollback ifzappa updatefails.If
zappa update staging(line 870) fails mid-deploy, the Terraform infrastructure will already be applied but the Lambda function may be in an inconsistent state. Consider adding error handling or a rollback step, or at minimum ensuringzappa updateis idempotent and safe to re-run.infrastructure/modules/networking/main.tf (1)
60-67: Single NAT gateway — acceptable for staging, worth noting for production.The NAT gateway is pinned to
aws_subnet.public[0], meaning all private-subnet egress routes through a single AZ. This is cost-effective for staging but creates a single point of failure. If this module is reused for production, consider provisioning one NAT gateway per AZ for high availability.infrastructure/modules/parameters/main.tf (1)
16-26: SecureString parameters use AWS-managed KMS — track CMK migration.Checkov flags CKV_AWS_337 on all
SecureStringparameters because they rely on the default AWS-managedaws/ssmkey rather than a customer-managed KMS CMK. Based on learnings, KMS CMK encryption is planned after S3 state management is completed. Leaving this as a tracking note — no action needed now, but ensure this is on the roadmap.Also applies to: 109-119, 149-159, 162-172, 174-184, 218-228
infrastructure/staging/variables.tf (1)
1-5: No validation that AZ count matches subnet CIDR counts.The networking module indexes into
availability_zones,public_subnet_cidrs, andprivate_subnet_cidrsbycount.index(e.g.,var.availability_zones[count.index]). If the lists have mismatched lengths, Terraform will fail with an index-out-of-range error at plan time. Consider adding a cross-variable validation.🛡️ Example validation (add to any one of the three variables)
variable "availability_zones" { description = "A list of availability zones for the VPC." type = list(string) default = ["us-east-2a", "us-east-2b", "us-east-2c"] validation { condition = length(var.availability_zones) > 0 error_message = "At least one availability zone must be specified." } }Note: Terraform variable-level validations cannot cross-reference other variables. The best place for a cross-variable check is a
checkblock or apreconditionin the module call inmain.tf:locals { az_count = length(var.availability_zones) } check "subnet_az_count_match" { assert { condition = length(var.public_subnet_cidrs) == local.az_count && length(var.private_subnet_cidrs) == local.az_count error_message = "The number of public/private subnet CIDRs must match the number of availability zones." } }Also applies to: 201-231
    - name: Show plan in summary
      working-directory: infrastructure/staging
      run: |
        echo "## Terraform Plan Output" >> $GITHUB_STEP_SUMMARY
        echo '```' >> $GITHUB_STEP_SUMMARY
        terraform show -no-color tfplan >> $GITHUB_STEP_SUMMARY
        echo '```' >> $GITHUB_STEP_SUMMARY
Terraform plan output in step summary may expose sensitive infrastructure details.
$GITHUB_STEP_SUMMARY is visible to anyone who can view the workflow run. For a public repository like OWASP/Nest, this could leak VPC IDs, subnet CIDRs, security group IDs, ARNs, and other infrastructure details. While Terraform masks values marked sensitive, many resource attributes are not marked sensitive yet could aid reconnaissance.
Consider either removing this step, restricting it to private visibility, or filtering the output.
🤖 Prompt for AI Agents
In @.github/workflows/run-ci-cd.yaml around lines 755 - 761, The "Show plan in
summary" step currently writes the raw terraform show -no-color tfplan into
$GITHUB_STEP_SUMMARY (working-directory: infrastructure/staging), which may leak
sensitive infra details; modify this by either removing the step entirely,
gating it so it only runs for private/internal runs, or replace the direct dump
with a filtered output: run terraform show -json tfplan and pipe through jq to
extract only non-sensitive summary fields (or redact VPC/subnet/security group
IDs) before appending to $GITHUB_STEP_SUMMARY, or alternatively upload the full
tfplan as a protected artifact instead of writing it to $GITHUB_STEP_SUMMARY.
    mock_model = MagicMock()
    mock_app_config = MagicMock()
    mock_app_config.get_models.return_value = [mock_model]
🛠️ Refactor suggestion | 🟠 Major
Dead code: mock_model and mock_app_config are created but never used.
These mocks appear in test_handle (Lines 19-21) and test_handle_with_custom_fixture_path (Lines 49-51). Remove them to avoid confusion.
Proposed fix (test_handle)
):
- mock_model = MagicMock()
- mock_app_config = MagicMock()
- mock_app_config.get_models.return_value = [mock_model]
-
mock_atomic.return_value.__enter__ = MagicMock()📝 Committable suggestion
| mock_model = MagicMock() | |
| mock_app_config = MagicMock() | |
| mock_app_config.get_models.return_value = [mock_model] |
🤖 Prompt for AI Agents
In `@backend/tests/apps/common/management/commands/load_data_test.py` around lines
19 - 21, Remove the unused MagicMock instances `mock_model` and
`mock_app_config` (and the `mock_app_config.get_models.return_value =
[mock_model]` line) from the tests `test_handle` and
`test_handle_with_custom_fixture_path`; locate the variable assignments to
`mock_model` and `mock_app_config` in those test functions and delete them so
the tests no longer define dead variables.
| def test_handle_with_exception_during_call_command( | ||
| self, | ||
| mock_atomic, | ||
| mock_call_command, | ||
| mock_register, | ||
| mock_unregister, | ||
| ): | ||
| """Test that indexing is re-enabled even if call_command fails.""" | ||
| mock_call_command.side_effect = Exception("Call command failed") | ||
|
|
||
| with patch("contextlib.suppress") as mock_suppress: | ||
| mock_suppress.return_value.__enter__ = MagicMock() | ||
| mock_suppress.return_value.__exit__ = MagicMock() | ||
|
|
||
| with contextlib.suppress(Exception): | ||
| call_command("load_data") | ||
|
|
||
| mock_unregister.assert_called_once() | ||
| mock_register.assert_called_once() | ||
|
|
||
| mock_atomic.assert_called_once() |
Exception test is fragile — patching contextlib.suppress undermines its own usage.
Line 84 replaces contextlib.suppress with a mock, then Line 88 calls contextlib.suppress(Exception) — which now invokes the mock, not the real suppressor. The exception is only suppressed because MagicMock().__exit__() returns a truthy MagicMock, which accidentally signals "suppress the exception." This is brittle and misleading.
Use pytest.raises instead for a straightforward, intention-revealing test:
Proposed fix
def test_handle_with_exception_during_call_command(
self,
mock_atomic,
mock_call_command,
mock_register,
mock_unregister,
):
"""Test that indexing is re-enabled even if call_command fails."""
+ mock_atomic.return_value.__enter__ = MagicMock()
+ mock_atomic.return_value.__exit__ = MagicMock()
mock_call_command.side_effect = Exception("Call command failed")
- with patch("contextlib.suppress") as mock_suppress:
- mock_suppress.return_value.__enter__ = MagicMock()
- mock_suppress.return_value.__exit__ = MagicMock()
-
- with contextlib.suppress(Exception):
- call_command("load_data")
+ with pytest.raises(Exception, match="Call command failed"):
+ call_command("load_data")
mock_unregister.assert_called_once()
mock_register.assert_called_once()
-
mock_atomic.assert_called_once()Note: you'll also need import pytest at the top of the file, and can remove import contextlib since it would no longer be used.
🤖 Prompt for AI Agents
In `@backend/tests/apps/common/management/commands/load_data_test.py` around lines
74 - 94, The test test_handle_with_exception_during_call_command incorrectly
patches contextlib.suppress (making the suppression accidental); instead remove
the patch of contextlib.suppress and replace the with-context using
pytest.raises(Exception): call call_command("load_data") inside pytest.raises to
assert the exception is raised, keep mock_call_command.side_effect =
Exception(...), add import pytest at the top and remove the now-unused import
contextlib, and retain the existing assertions that
mock_unregister.assert_called_once(), mock_register.assert_called_once(), and
mock_atomic.assert_called_once().
    for filepath in temp_path.rglob("*"):
        if filepath == new_archive_path:
            continue
        if filepath.is_file() and not any(re.search(p, str(filepath)) for p in excludes):
            tf.add(filepath, filepath.relative_to(temp_path))
Bug: Regex patterns are matched against the full temp directory path, not the archive-relative path.
str(filepath) includes the temp directory prefix (e.g., /tmp/abc123/some/file.py), but regex_excludes patterns are presumably written for archive-relative paths. This will cause patterns to either fail to match or match unintended paths.
Use the relative path instead:
Suggested fix
for filepath in temp_path.rglob("*"):
if filepath == new_archive_path:
continue
- if filepath.is_file() and not any(re.search(p, str(filepath)) for p in excludes):
+ relative = str(filepath.relative_to(temp_path))
+ if filepath.is_file() and not any(re.search(p, relative) for p in excludes):
        tf.add(filepath, filepath.relative_to(temp_path))

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
for filepath in temp_path.rglob("*"):
    if filepath == new_archive_path:
        continue
    relative = str(filepath.relative_to(temp_path))
    if filepath.is_file() and not any(re.search(p, relative) for p in excludes):
        tf.add(filepath, filepath.relative_to(temp_path))
🤖 Prompt for AI Agents
In `@backend/zappa_callback.py` around lines 39 - 43, The regex excludes are being
tested against the full filesystem path (str(filepath)) instead of the
archive-relative path, so update the exclude check in the loop to compute the
relative path once (e.g., rel = filepath.relative_to(temp_path) and use str(rel)
or rel.as_posix()) and run the any(re.search(p, str(rel)) for p in excludes)
against that relative path before calling tf.add(filepath, rel); keep the
existing new_archive_path comparison as-is.
client = boto3.client("lambda")
versions = client.list_versions_by_function(FunctionName=zappa.lambda_name)["Versions"]

if not (published := [v["Version"] for v in versions if v["Version"] != "$LATEST"]):
    print("No published versions found, skipping alias update")
    return
list_versions_by_function is paginated — this may miss versions.
The AWS Lambda list_versions_by_function API returns at most 50 versions per call. Both update_alias (Line 56) and cleanup_versions (Line 76) use a single call without handling pagination. If the function accumulates more than 50 versions, update_alias may not find the actual latest version, and cleanup_versions will not delete all old versions.
Consider using a paginator:
Suggested approach
- versions = client.list_versions_by_function(FunctionName=zappa.lambda_name)["Versions"]
+ paginator = client.get_paginator("list_versions_by_function")
+ versions = [
+ v
+ for page in paginator.paginate(FunctionName=zappa.lambda_name)
+ for v in page["Versions"]
+ ]

Apply the same pattern in cleanup_versions.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
client = boto3.client("lambda")
paginator = client.get_paginator("list_versions_by_function")
versions = [
    v
    for page in paginator.paginate(FunctionName=zappa.lambda_name)
    for v in page["Versions"]
]

if not (published := [v["Version"] for v in versions if v["Version"] != "$LATEST"]):
    print("No published versions found, skipping alias update")
    return
🤖 Prompt for AI Agents
In `@backend/zappa_callback.py` around lines 55 - 60, The current single call to
boto3.client("lambda").list_versions_by_function may miss versions due to AWS
pagination; replace it with a paginator
(client.get_paginator("list_versions_by_function")) to iterate all pages and
accumulate Versions before computing the published list (the code that builds
published := [v["Version"] ...] used by update_alias) and apply the same
paginated collection pattern inside cleanup_versions so both update_alias and
cleanup_versions operate on the full set of versions rather than only the first
page.
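For reference, a minimal sketch of what a paginated cleanup_versions could look like — the standalone function signature and the `keep_last` parameter here are assumptions for illustration, not the actual helper in backend/zappa_callback.py:

import boto3


def cleanup_versions(lambda_name: str, keep_last: int = 5) -> None:
    """Delete old published Lambda versions, keeping the newest `keep_last`."""
    client = boto3.client("lambda")
    paginator = client.get_paginator("list_versions_by_function")

    # Collect every published version across all pages (skip $LATEST).
    published = sorted(
        (
            v["Version"]
            for page in paginator.paginate(FunctionName=lambda_name)
            for v in page["Versions"]
            if v["Version"] != "$LATEST"
        ),
        key=int,
    )

    # Delete everything older than the newest `keep_last` versions.
    for version in published[:-keep_last] if keep_last else published:
        client.delete_function(FunctionName=lambda_name, Qualifier=version)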
| resource "aws_ssm_parameter" "django_secret_key" { | ||
| description = "Django Secret Key generated by Terraform." | ||
| name = "/${var.project_name}/${var.environment}/DJANGO_SECRET_KEY" | ||
| tags = var.common_tags | ||
| type = "SecureString" | ||
| value = random_string.django_secret_key.result | ||
| } |
Generated secrets lack lifecycle { ignore_changes } — recreation risk.
Unlike the manually-set parameters, django_secret_key and nextauth_secret SSM parameters are bound directly to random_password/random_string resources. If those resources are ever tainted or recreated (e.g., state manipulation, provider upgrade, terraform taint), the secret values will change silently, invalidating all active Django sessions and NextAuth tokens.
Consider adding lifecycle { ignore_changes = [value] } to these two SSM parameters so that once the initial secret is seeded, Terraform won't overwrite it. Rotation can then be handled deliberately.
🛡️ Proposed fix
resource "aws_ssm_parameter" "django_secret_key" {
description = "Django Secret Key generated by Terraform."
name = "/${var.project_name}/${var.environment}/DJANGO_SECRET_KEY"
tags = var.common_tags
type = "SecureString"
value = random_string.django_secret_key.result
+
+ lifecycle {
+ ignore_changes = [value]
+ }
 }

 resource "aws_ssm_parameter" "nextauth_secret" {
description = "NextAuth secret key generated by Terraform."
name = "/${var.project_name}/${var.environment}/NEXTAUTH_SECRET"
tags = var.common_tags
type = "SecureString"
value = random_string.nextauth_secret.result
+
+ lifecycle {
+ ignore_changes = [value]
+ }
 }

Also applies to: 238-244
🧰 Tools
🪛 Checkov (3.2.334)
[high] 129-135: Ensure SSM parameters are using KMS CMK
(CKV_AWS_337)
🤖 Prompt for AI Agents
In `@infrastructure/modules/parameters/main.tf` around lines 129 - 135, The
aws_ssm_parameter resources for the generated secrets (e.g.,
aws_ssm_parameter.django_secret_key and aws_ssm_parameter.nextauth_secret)
currently track the value from random_string/random_password and will be
overwritten if those resources are recreated; add a lifecycle block to each SSM
parameter with ignore_changes = [value] so Terraform will not update the stored
secret after initial creation (allowing deliberate rotation instead). Locate the
aws_ssm_parameter resource blocks for django_secret_key and nextauth_secret and
add lifecycle { ignore_changes = [value] } inside each resource.
| resource "random_string" "django_secret_key" { | ||
| length = 50 | ||
| special = true | ||
| } | ||
|
|
||
| resource "random_string" "nextauth_secret" { | ||
| length = 32 | ||
| special = true | ||
| } |
🧩 Analysis chain
🌐 Web query:
Terraform random_password result sensitive attribute vs random_string
💡 Result:
- `random_password` vs `random_string`: They generate values the same way (both use a cryptographic RNG), but `random_password.result` is marked sensitive, while `random_string.result` is not. In practice that means `random_password` won't be shown in plan/apply output, whereas `random_string` will. [1]
- What "sensitive" actually does: It only affects UI/CLI redaction (e.g., `terraform output name` shows `<sensitive>`). The real value is still stored in Terraform state. [2]
- How to retrieve a sensitive `random_password` value when needed: If you expose it via an output, `terraform output -raw <name>` or `terraform output -json` will print the plaintext value. [2]
- If your goal is "don't store it in state": Prefer ephemeral patterns (e.g., ephemeral outputs / the ephemeral `random_password` variant when possible) because "sensitive" alone does not prevent state storage. [1] [2]
- Debug-only unredaction: You can wrap a sensitive value with `nonsensitive(...)` to display it in plan/output, but this is for troubleshooting and doesn't make it safer. [3]
Sources:
[1] terraform-provider-random random_password doc/code (sensitive result; identical to random_string) (third-party-mirror.googlesource.com)
[2] HashiCorp Terraform docs: sensitive outputs, state storage, -json/-raw, and ephemeral outputs (developer.hashicorp.com)
[3] HashiCorp support article on using nonsensitive (support.hashicorp.com)
Citations:
- 1: https://third-party-mirror.googlesource.com/terraform-provider-random/%2Bshow/1f8584c3dffeaeedb5c15071ea990d110f4fca27/docs/resources/password.md?utm_source=openai
- 2: https://developer.hashicorp.com/terraform/language/values/outputs
- 3: https://support.hashicorp.com/hc/en-us/articles/46913421636499-Check-the-value-of-sensitive-attributes-in-terraform-plan?utm_source=openai
🏁 Script executed:
# Check the actual file at the specified lines and find references
head -300 infrastructure/modules/parameters/main.tf | tail -50Repository: OWASP/Nest
Length of output: 1548
🏁 Script executed:
# Search for references to the random_string/random_password resources
grep -n "random_string\|random_password" infrastructure/modules/parameters/main.tfRepository: OWASP/Nest
Length of output: 269
Use random_password instead of random_string for secret generation.
random_string outputs are not marked as sensitive, so they appear in plaintext in terraform plan, terraform apply output, and logs. random_password behaves identically but marks its result attribute as sensitive, preventing accidental exposure in CLI output and logs.
Update lines 258–266:
Proposed fix
-resource "random_string" "django_secret_key" {
+resource "random_password" "django_secret_key" {
length = 50
special = true
}
-resource "random_string" "nextauth_secret" {
+resource "random_password" "nextauth_secret" {
length = 32
special = true
 }

Also update the references on line 134 and line 243:

- value = random_string.django_secret_key.result
+ value = random_password.django_secret_key.result

- value = random_string.nextauth_secret.result
+ value = random_password.nextauth_secret.result

🤖 Prompt for AI Agents
In `@infrastructure/modules/parameters/main.tf` around lines 258 - 266, Replace
the two random_string resources with random_password resources so the secrets
are marked sensitive: change resource "random_string" "django_secret_key" and
resource "random_string" "nextauth_secret" to resource "random_password"
"django_secret_key" and resource "random_password" "nextauth_secret" (keeping
length and special attributes), and update any references elsewhere that use
random_string.django_secret_key or random_string.nextauth_secret to
random_password.django_secret_key and random_password.nextauth_secret so the
.result values remain the same but are treated as sensitive.
| resource "aws_security_group_rule" "ecs_to_vpc_endpoints" { | ||
| count = var.create_vpc_endpoint_rules ? 1 : 0 | ||
| description = "Allow HTTPS to VPC endpoints" | ||
| from_port = 443 | ||
| protocol = "tcp" | ||
| security_group_id = aws_security_group.ecs.id | ||
| source_security_group_id = var.vpc_endpoint_sg_id | ||
| to_port = 443 | ||
| type = "egress" | ||
| } |
🧩 Analysis chain
🏁 Script executed:
# Explore the database module structure
fd "main.tf" --full-path --glob '**/modules/database/**' --type fRepository: OWASP/Nest
Length of output: 163
🏁 Script executed:
# Check for RDS Proxy configuration and Secrets Manager references
rg -i "rds.*proxy|db.*proxy" infrastructure/modules/database/ -A 5Repository: OWASP/Nest
Length of output: 9730
🏁 Script executed:
# Search for Secrets Manager usage in the codebase
rg -i "secrets.*manager|secretsmanager" infrastructure/ -lRepository: OWASP/Nest
Length of output: 572
🏁 Script executed:
cat -n infrastructure/modules/security/main.tf | head -200Repository: OWASP/Nest
Length of output: 8202
🏁 Script executed:
# Check if rds_proxy security group is created in security module
rg -i "rds_proxy\|rds.*proxy" infrastructure/modules/security/Repository: OWASP/Nest
Length of output: 36
🏁 Script executed:
# Look for VPC endpoint configuration and Secrets Manager endpoint
cat -n infrastructure/modules/networking/modules/vpc-endpoint/main.tfRepository: OWASP/Nest
Length of output: 5074
🏁 Script executed:
cat -n infrastructure/modules/security/main.tf | tail -100Repository: OWASP/Nest
Length of output: 4605
🏁 Script executed:
# Search for all egress rules related to rds_proxy
rg "rds_proxy" infrastructure/modules/security/main.tf -A 5 -B 5Repository: OWASP/Nest
Length of output: 3152
Add VPC endpoint egress rule for RDS Proxy security group.
When RDS Proxy is enabled with Secrets Manager VPC endpoint, it needs HTTPS egress to the VPC endpoint security group to fetch DB credentials. Currently, only the ECS, frontend, and Lambda security groups have this rule. Add a matching egress rule for the RDS Proxy security group:
Example
resource "aws_security_group_rule" "rds_proxy_to_vpc_endpoints" {
count = var.create_rds_proxy && var.create_vpc_endpoint_rules ? 1 : 0
description = "Allow HTTPS to VPC endpoints"
from_port = 443
protocol = "tcp"
security_group_id = aws_security_group.rds_proxy[0].id
source_security_group_id = var.vpc_endpoint_sg_id
to_port = 443
type = "egress"
}
Without this rule, RDS Proxy authentication will fail when trying to retrieve secrets from Secrets Manager via VPC endpoint.
🤖 Prompt for AI Agents
In `@infrastructure/modules/security/main.tf` around lines 77 - 86, Add an egress
security group rule allowing HTTPS from the RDS Proxy security group to the VPC
endpoint SG: create a new resource named
aws_security_group_rule.rds_proxy_to_vpc_endpoints that is conditional on
var.create_rds_proxy && var.create_vpc_endpoint_rules, with description "Allow
HTTPS to VPC endpoints", type "egress", protocol "tcp", from_port/to_port 443,
security_group_id set to aws_security_group.rds_proxy[0].id and
source_security_group_id set to var.vpc_endpoint_sg_id so the RDS Proxy can
reach Secrets Manager via the VPC endpoint.
| resource "aws_iam_policy" "fixtures_read_only" { | ||
| name = "${var.project_name}-${var.environment}-fixtures-read-only" | ||
| description = "Allows read-only access to the fixtures S3 bucket" | ||
| policy = jsonencode({ | ||
| Version = "2012-10-17" | ||
| Statement = [{ | ||
| Effect = "Allow" | ||
| Action = ["s3:GetObject"] | ||
| Resource = ["arn:aws:s3:::${var.fixtures_bucket_name}-${random_id.suffix.hex}/*"] | ||
| }] | ||
| }) | ||
| tags = var.common_tags | ||
| } |
🧩 Analysis chain
🏁 Script executed:
fd 'outputs.tf' --full-path --glob '**/storage/modules/s3-bucket/**'Repository: OWASP/Nest
Length of output: 172
🏁 Script executed:
# Find the s3-bucket module and check its structure
find . -type d -path '*/storage/modules/s3-bucket' 2>/dev/nullRepository: OWASP/Nest
Length of output: 104
🏁 Script executed:
# List the main storage module to understand its structure
ls -la infrastructure/modules/storage/ 2>/dev/null || ls -la .Repository: OWASP/Nest
Length of output: 533
🏁 Script executed:
cat infrastructure/modules/storage/modules/s3-bucket/outputs.tfRepository: OWASP/Nest
Length of output: 251
🏁 Script executed:
cat infrastructure/modules/storage/modules/s3-bucket/main.tfRepository: OWASP/Nest
Length of output: 1803
🏁 Script executed:
cat infrastructure/modules/storage/main.tfRepository: OWASP/Nest
Length of output: 1253
Consider referencing the module output for the bucket ARN instead of reconstructing it.
Line 36 manually rebuilds the bucket name (${var.fixtures_bucket_name}-${random_id.suffix.hex}) to construct the ARN, duplicating the logic from line 22. If the s3-bucket submodule ever changes its naming, this policy resource will silently drift. Use the module's arn output instead.
♻️ Proposed fix
policy = jsonencode({
Version = "2012-10-17"
Statement = [{
Effect = "Allow"
Action = ["s3:GetObject"]
- Resource = ["arn:aws:s3:::${var.fixtures_bucket_name}-${random_id.suffix.hex}/*"]
+ Resource = ["${module.fixtures_bucket.arn}/*"]
}]
  })

🤖 Prompt for AI Agents
In `@infrastructure/modules/storage/main.tf` around lines 28 - 40, The IAM policy
resource aws_iam_policy.fixtures_read_only is reconstructing the bucket ARN
using ${var.fixtures_bucket_name}-${random_id.suffix.hex} which duplicates
naming logic and can drift; replace the hardcoded Resource value with the S3
module's ARN output (e.g., use module.<s3_module_name>.arn or
module.fixtures_bucket.arn) so the policy references the canonical bucket ARN
output from the s3-bucket submodule instead of rebuilding it from
var.fixtures_bucket_name and random_id.suffix.hex.
> [!NOTE]
> Copy the state bucket name from the output.

> [!NOTE]
> It is recommended to not destroy the backend resources unless absolutely necessary.
Consecutive blockquotes separated by blank lines trigger MD028.
Markdownlint flags blank lines between adjacent blockquotes (Lines 43, 81, 178). This is cosmetic and only affects strict linting — GitHub renders these correctly. If you want to silence the warning, combine adjacent notes into a single blockquote block or add an <!-- markdownlint-disable MD028 --> directive.
🧰 Tools
🪛 markdownlint-cli2 (0.20.0)
[warning] 43-43: Blank line inside blockquote
(MD028, no-blanks-blockquote)
🤖 Prompt for AI Agents
In `@infrastructure/README.md` around lines 41 - 46, Two adjacent blockquote notes
("Copy the state bucket name from the output." and "It is recommended to not
destroy the backend resources unless absolutely necessary.") are separated by a
blank line which triggers MD028; fix by either merging the consecutive
blockquotes into a single blockquote block (combine their lines so they appear
in one contiguous > paragraph) or add a comment directive <!--
markdownlint-disable MD028 --> above the first blockquote and re-enable it after
if preferred; update the README.md sections containing those quoted lines
accordingly.
| """Load data into the OWASP Nest application.""" | ||
| with index.disable_indexing(), transaction.atomic(): | ||
| # Run loaddata | ||
| call_command("loaddata", _options["fixture_path"], "-v", "3") |
This seems to have been migrated to a psql-based command. I will update this and the infrastructure to use that in a new PR.
…-migration-sync Sync feature/nest-zappa-migraiton With main
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
backend/settings/base.py (1)
156-164: ⚠️ Potential issue | 🟠 Major
`RQ_QUEUES` missing TLS configuration will cause connection failures to Redis if TLS is enforced. The cache `LOCATION` uses `rediss://` (TLS-enabled) when `REDIS_AUTH_ENABLED` is true, but `RQ_QUEUES` connects to the same `REDIS_HOST` via plain `HOST`/`PORT` without SSL settings. When the Redis instance enforces encryption in transit, RQ worker connections will be rejected. Update `RQ_QUEUES` to mirror the TLS logic applied to the cache configuration:
Proposed fix
 RQ_QUEUES = {
     "ai": {
-        "HOST": REDIS_HOST,
-        "PORT": 6379,
-        "PASSWORD": REDIS_PASSWORD,
-        "DB": 1,
-        "DEFAULT_TIMEOUT": 300,
+        "URL": f"rediss://:{REDIS_PASSWORD}@{REDIS_HOST}:6379/1"
+        if REDIS_AUTH_ENABLED
+        else f"redis://{REDIS_HOST}:6379/1",
+        "DEFAULT_TIMEOUT": 300,
     }
 }
🤖 Fix all issues with AI agents
In @.github/workflows/run-ci-cd.yaml:
- Around line 765-767: The current echo lines hardcode indices for
private_subnet_a/private_subnet_b/private_subnet_c using jq -r
'.[0]'/'[1]'/'[2]' which fails if terraform output has fewer than 3 subnets;
change this to either (a) validate the count from terraform output (e.g.,
capture terraform output -json private_subnet_ids into a variable, use jq
'length' and fail with a clear error if < 3) or (b) dynamically iterate the
array and emit outputs for each entry (e.g., use jq -r '.[]' and a shell loop to
index and write "private_subnet_<index>=<id>" to $GITHUB_OUTPUT) so downstream
ECS run-task and Zappa receive only valid subnet IDs; update places that expect
fixed names if you choose dynamic naming.
- Around line 700-873: The deploy-staging-nest job lacks a timeout-minutes
setting which can let long-running steps (Terraform Apply, Zappa update, ECS
task waits such as Run ECS migrate task and Wait for frontend service stability)
consume runner time; add a timeout-minutes entry to the deploy-staging-nest job
(e.g. timeout-minutes: 60 or an appropriate value for staging) directly under
the job definition (deploy-staging-nest) after the job-level keys and before
steps to ensure the entire job is bounded.
- Around line 797-799: The CI step named "Update Zappa deployment" is using
`zappa update staging`, which fails on initial deployments because the Lambda
doesn't exist yet; change the step to run `zappa deploy staging` (or add
conditional logic to check if the Zappa-managed Lambda exists and run `deploy`
for first-time/absent functions and `update` otherwise) — locate the workflow
step by the step name "Update Zappa deployment" (working-directory: backend) and
replace the command accordingly.
In `@infrastructure/backend/main.tf`:
- Around line 136-150: The lifecycle for the S3 logs bucket (resource
aws_s3_bucket_lifecycle_configuration "logs" targeting aws_s3_bucket.logs)
currently only uses expiration for current objects, so noncurrent object
versions will never be removed; update the lifecycle to add a
noncurrent_version_expiration block (e.g., days = var.noncurrent_expire_days or
a new variable) and/or noncurrent_version_transition rules as appropriate to
ensure older versions are purged, or else remove/disable versioning on
aws_s3_bucket.logs if retaining versions is not required; ensure you reference
the existing abort_incomplete_multipart_upload, expiration and variable names
(var.expire_log_days) when adding the noncurrent rules so the configuration is
consistent.
- Around line 73-95: The DynamoDB state_lock resource
(aws_dynamodb_table.state_lock) needs AWS-level deletion protection and the KMS
key must allow DynamoDB to use it: add deletion_protection_enabled = true to the
aws_dynamodb_table.state_lock resource, and update the KMS module key policy (in
infrastructure/modules/kms main policy referenced by module.kms.key_arn) to
include a statement granting DynamoDB the actions kms:Encrypt, kms:Decrypt,
kms:DescribeKey, kms:GenerateDataKey*, kms:ReEncrypt*, kms:CreateGrant with a
Condition kms:ViaService set to dynamodb.<region>.amazonaws.com (or wildcard for
region) so DynamoDB can use the key for server_side_encryption; ensure the
principal and resource fields match the key policy structure and include the
account root as needed.
In `@infrastructure/README.md`:
- Around line 245-251: The README's S3 upload step uses an unexplained
placeholder `owasp-nest-fixtures-<id>`; add a short NOTE near that command
clarifying that `<id>` is the unique bucket suffix from the Terraform output
`fixtures_bucket_name` (or other source) and instruct the user to replace it
with that exact value; reference the command that mentions `data/nest.json.gz`
and the bucket name `owasp-nest-fixtures-<id>` so reviewers can find where to
insert the NOTE.
🧹 Nitpick comments (12)
infrastructure/README.md (3)
61-65: Prefer `cp` over `cat >` for copying files. The command `cat terraform.tfvars.example > terraform.tfvars` works but is less idiomatic than `cp terraform.tfvars.example terraform.tfvars`. The standard copy command is clearer and more direct.
📝 Suggested simplification
- - Copy the contents from the example file:
-
-   ```bash
-   cat terraform.tfvars.example > terraform.tfvars
-   ```
+ - Copy the example file:
+
+   ```bash
+   cp terraform.tfvars.example terraform.tfvars
+   ```
73-77: Prefer `cp` over `cat >` for copying files. Same as the previous comment—using `cp` is more idiomatic than `cat >` for file copying.
📝 Suggested simplification
- - Copy the contents from the example file:
-
-   ```bash
-   cat terraform.tfbackend.example > terraform.tfbackend
-   ```
+ - Copy the example file:
+
+   ```bash
+   cp terraform.tfbackend.example terraform.tfbackend
+   ```
132-136: Prefer `cp` over `cat >` for copying files. Same pattern as earlier—`cp` is more idiomatic than `cat >` for file copying.
📝 Suggested simplification
- - Copy the contents from the template file into your new local environment file:
-
-   ```bash
-   cat zappa_settings.template.json > zappa_settings.json
-   ```
+ - Copy the template file:
+
+   ```bash
+   cp zappa_settings.template.json zappa_settings.json
+   ```
.github/workflows/run-ci-cd.yaml (1)
621-626: Consider using OIDC for AWS authentication instead of static access keys. Both `plan-staging-nest` and `deploy-staging-nest` use long-lived `AWS_ACCESS_KEY_ID`/`AWS_SECRET_ACCESS_KEY` static credentials. GitHub Actions supports OIDC federation with AWS via `role-to-assume`, which eliminates the need for stored secrets and provides short-lived credentials. This is a security best practice, especially for a public repository.
Example OIDC configuration
 - name: Configure AWS credentials
   uses: aws-actions/configure-aws-credentials@61815dcd50bd041e203e49132bacad1fd04d2708
   with:
-    aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
     aws-region: ${{ vars.AWS_REGION }}
-    aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
+    role-to-assume: ${{ secrets.AWS_ROLE_ARN }}
This requires setting up an IAM OIDC provider and role in AWS, and adding `id-token: write` to the job's permissions.
Also applies to: 718-723
infrastructure/backend/main.tf (2)
23-35: Consider adding an HTTPS-only policy for the logs bucket as well. The state bucket correctly enforces HTTPS-only access via `state_https_only`, but the logs bucket has no equivalent deny policy. IAM principals with access to the logs bucket could potentially interact with it over plain HTTP. For consistency and defense-in-depth, apply the same `SecureTransport` deny condition to the logs bucket.
♻️ Suggested addition
+data "aws_iam_policy_document" "logs_https_only" {
+  policy_id = "ForceHTTPS"
+
+  statement {
+    actions = ["s3:*"]
+    sid = "HTTPSOnly"
+    effect = "Deny"
+
+    condition {
+      test = "Bool"
+      values = ["false"]
+      variable = "aws:SecureTransport"
+    }
+    principals {
+      identifiers = ["*"]
+      type = "AWS"
+    }
+    resources = [
+      aws_s3_bucket.logs.arn,
+      "${aws_s3_bucket.logs.arn}/*",
+    ]
+  }
+}
Then combine both statements in the logs bucket policy (or use `source_policy_documents`/`override_policy_documents` to merge them):
 resource "aws_s3_bucket_policy" "logs" {
   bucket = aws_s3_bucket.logs.id
-  policy = data.aws_iam_policy_document.logs.json
+  policy = data.aws_iam_policy_document.logs_combined.json
 }
195-213: AES256 encryption is acknowledged via `trivy:ignore` — consider upgrading the state bucket to KMS-CMK. The Trivy suppressions indicate this is intentional. AES256 for the logs bucket is reasonable since S3 server access logging can have compatibility issues with KMS-encrypted target buckets. However, the state bucket holds sensitive infrastructure state and would benefit from CMK encryption (matching the DynamoDB table's approach), with key rotation and audit trail via CloudTrail. This could be deferred if the KMS module is still being stabilized.
infrastructure/modules/ecs/main.tf (1)
14-28: ECS tasks cluster is missing `containerInsights`. The frontend module's cluster (Line 64-67 of `frontend/main.tf`) explicitly enables `containerInsights`, but this tasks cluster does not. Without it, you lose ECS-level CPU/memory/task metrics in CloudWatch, making it harder to troubleshoot scheduled task failures or resource pressure.
♻️ Proposed fix to enable Container Insights
 resource "aws_ecs_cluster" "main" {
   name = "${var.project_name}-${var.environment}-tasks-cluster"
   tags = var.common_tags
+
+  setting {
+    name = "containerInsights"
+    value = "enabled"
+  }
 }
infrastructure/modules/frontend/main.tf (1)
109-136: Consider enabling a deployment circuit breaker for the frontend ECS service. The `aws_ecs_service` resource does not configure `deployment_circuit_breaker`. Without it, a broken image deployment can leave the service stuck in a rollout loop with failing tasks draining capacity. Enabling the circuit breaker with rollback provides a safety net.
♻️ Suggested addition
 resource "aws_ecs_service" "frontend" {
   cluster = aws_ecs_cluster.frontend.id
   desired_count = var.desired_count
   health_check_grace_period_seconds = 60
   name = "${var.project_name}-${var.environment}-frontend-service"
   ...
   task_definition = aws_ecs_task_definition.frontend.arn
+  deployment_circuit_breaker {
+    enable = true
+    rollback = true
+  }
+
   capacity_provider_strategy {
infrastructure/modules/networking/main.tf (2)
60-67: Single NAT gateway creates a single point of failure across AZs. All private subnets route through one NAT gateway in `public[0]`. If that AZ becomes unavailable, all private subnet egress (ECR pulls, DB connections via public endpoints, external API calls) will fail. For a staging environment this is a reasonable cost trade-off, but for production consider deploying one NAT gateway per AZ.
22-43: Subnet creation assumes matching array lengths for AZs and CIDRs.
`count = length(var.public_subnet_cidrs)` (Line 25) indexes into `var.availability_zones[count.index]` (Line 23). If the AZ list is shorter than the CIDR list, this will fail at plan time with an index-out-of-range error. Consider adding a validation block in the variables file to ensure `length(var.availability_zones) >= length(var.public_subnet_cidrs)` and similarly for private subnets.
#!/bin/bash
# Check if there's any validation in the networking module's variables.tf
fd -p 'modules/networking/variables.tf' --exec grep -A 10 'validation' {} \;
infrastructure/modules/alb/main.tf (2)
135-150: Listener rule priority relies on implicit string-to-number conversion.
`each.key` from the `for_each` map is a string (e.g., `"0"`, `"1"`, `"2"`), so `100 + each.key` relies on Terraform's implicit type coercion. While this works, an explicit `tonumber()` call would improve clarity.
♻️ Explicit type conversion
- priority = 100 + each.key
+ priority = 100 + tonumber(each.key)
83-102: ALB has `enable_deletion_protection = false`.
This is fine for staging/dev, but for production the ALB should have deletion protection enabled to prevent accidental `terraform destroy`. Consider making this a variable so it can be toggled per environment.
| echo "private_subnet_a=$(terraform output -json private_subnet_ids | jq -r '.[0]')" >> $GITHUB_OUTPUT | ||
| echo "private_subnet_b=$(terraform output -json private_subnet_ids | jq -r '.[1]')" >> $GITHUB_OUTPUT | ||
| echo "private_subnet_c=$(terraform output -json private_subnet_ids | jq -r '.[2]')" >> $GITHUB_OUTPUT |
Hardcoded subnet indices assume exactly 3 availability zones.
Lines 765–767 use jq -r '.[0]', '.[1]', '.[2]' to extract private subnets. If the Terraform output returns fewer than 3 subnets, these will resolve to null, causing downstream ECS run-task and Zappa configuration to receive invalid subnet IDs.
Consider either validating the count or dynamically iterating instead of hardcoding indices.
🤖 Prompt for AI Agents
In @.github/workflows/run-ci-cd.yaml around lines 765 - 767, The current echo
lines hardcode indices for private_subnet_a/private_subnet_b/private_subnet_c
using jq -r '.[0]'/'[1]'/'[2]' which fails if terraform output has fewer than 3
subnets; change this to either (a) validate the count from terraform output
(e.g., capture terraform output -json private_subnet_ids into a variable, use jq
'length' and fail with a clear error if < 3) or (b) dynamically iterate the
array and emit outputs for each entry (e.g., use jq -r '.[]' and a shell loop to
index and write "private_subnet_<index>=<id>" to $GITHUB_OUTPUT) so downstream
ECS run-task and Zappa receive only valid subnet IDs; update places that expect
fixed names if you choose dynamic naming.
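If option (b) is chosen, the same logic can be sketched in Python — the workflow step itself uses shell and jq, so treat this purely as an illustration of the count check and dynamic output naming; the minimum of 3 mirrors the current hardcoded indices:

import json
import os
import subprocess

# Read the Terraform output as JSON (the same data the jq commands consume).
result = subprocess.run(
    ["terraform", "output", "-json", "private_subnet_ids"],
    capture_output=True, check=True, text=True,
)
subnet_ids = json.loads(result.stdout)

# Fail fast with a clear error instead of emitting null subnet IDs downstream.
if len(subnet_ids) < 3:
    raise SystemExit(f"Expected at least 3 private subnets, got {len(subnet_ids)}")

# Emit one output per subnet rather than hardcoding indices.
with open(os.environ["GITHUB_OUTPUT"], "a") as fh:
    for index, subnet_id in enumerate(subnet_ids):
        fh.write(f"private_subnet_{index}={subnet_id}\n")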
- name: Update Zappa deployment
  working-directory: backend
  run: poetry run zappa update staging
🧩 Analysis chain
🏁 Script executed:
# Get the full workflow file to understand the deployment context around lines 797-799
wc -l .github/workflows/run-ci-cd.yamlRepository: OWASP/Nest
Length of output: 91
🏁 Script executed:
# Read the workflow around the mentioned lines to understand context
sed -n '700,850p' .github/workflows/run-ci-cd.yamlRepository: OWASP/Nest
Length of output: 6512
🏁 Script executed:
# Search for zappa deploy/update patterns in the workflow
rg -n "zappa deploy|zappa update" .github/workflows/run-ci-cd.yamlRepository: OWASP/Nest
Length of output: 102
🏁 Script executed:
# Check if there's any zappa configuration or initialization logic
find . -name "zappa*" -o -name ".zappa*" | head -20Repository: OWASP/Nest
Length of output: 120
🏁 Script executed:
# Look for any deploy vs update conditional logic in the repo
rg -n "if.*deploy\|if.*update" --type yaml .github/workflows/ -A 2 -B 2Repository: OWASP/Nest
Length of output: 36
🏁 Script executed:
# Check if Terraform creates any Lambda resources
find infrastructure -name "*.tf" | head -20Repository: OWASP/Nest
Length of output: 855
🏁 Script executed:
# Search for Lambda function definition in Terraform
rg -l "aws_lambda" infrastructure/Repository: OWASP/Nest
Length of output: 136
🏁 Script executed:
# Read Zappa settings template to understand configuration
cat backend/zappa_settings.template.jsonRepository: OWASP/Nest
Length of output: 1529
🏁 Script executed:
# Check if there's any zappa deploy in the entire workflow or infrastructure
rg -n "zappa deploy" .github/workflows/Repository: OWASP/Nest
Length of output: 36
🏁 Script executed:
# Check if Terraform actually creates any Lambda functions
rg -n "aws_lambda_function\|aws_lambda" infrastructure/Repository: OWASP/Nest
Length of output: 36
🏁 Script executed:
# Get a clearer picture of what's in the backend Terraform module
cat infrastructure/backend/main.tf | head -100Repository: OWASP/Nest
Length of output: 1999
🏁 Script executed:
# Check if there's any documentation or comments about deployment strategy
rg -n "zappa\|lambda\|deploy" README.md CONTRIBUTING.md 2>/dev/null | head -30Repository: OWASP/Nest
Length of output: 36
🏁 Script executed:
# Look for any other CI/CD workflows that might have initial deployment logic
ls -la .github/workflows/ | grep -i deployRepository: OWASP/Nest
Length of output: 36
🏁 Script executed:
# Check the staging infrastructure configuration
ls -la infrastructure/staging/Repository: OWASP/Nest
Length of output: 686
🏁 Script executed:
# Look at the main staging Terraform configuration
cat infrastructure/staging/main.tf | head -150Repository: OWASP/Nest
Length of output: 5935
🏁 Script executed:
# Check if any Terraform modules create Lambda functions
rg -r "aws_lambda_function" infrastructure/modules/Repository: OWASP/Nest
Length of output: 36
🏁 Script executed:
# Check git history to see if there's ever been a zappa deploy step
git log --oneline -n 50 -- .github/workflows/run-ci-cd.yaml 2>/dev/null | head -20Repository: OWASP/Nest
Length of output: 147
🏁 Script executed:
# Check how the ALB module uses the lambda_function_name
cat infrastructure/modules/alb/main.tf | grep -A 5 -B 5 "lambda_function_name"Repository: OWASP/Nest
Length of output: 2028
🏁 Script executed:
# Check if the ALB module creates a Lambda resource
rg -n "aws_lambda" infrastructure/modules/alb/main.tfRepository: OWASP/Nest
Length of output: 405
🏁 Script executed:
# Look at the recent commit that mentioned zappa migration
git show 7eea02fd --stat 2>/dev/null | head -50Repository: OWASP/Nest
Length of output: 2957
🏁 Script executed:
# Check the workflow history to see if there was ever a zappa deploy step before
git log --all --oneline -- .github/workflows/run-ci-cd.yaml 2>/dev/null | head -30Repository: OWASP/Nest
Length of output: 1804
zappa update will fail on initial deployment—use zappa deploy first.
The Lambda function is managed by Zappa, not Terraform. On first deployment (or if the function is deleted), zappa update staging assumes the Lambda already exists and will fail. Replace with zappa deploy staging, or add conditional logic to detect and handle the initial deployment scenario.
🤖 Prompt for AI Agents
In @.github/workflows/run-ci-cd.yaml around lines 797 - 799, The CI step named
"Update Zappa deployment" is using `zappa update staging`, which fails on
initial deployments because the Lambda doesn't exist yet; change the step to run
`zappa deploy staging` (or add conditional logic to check if the Zappa-managed
Lambda exists and run `deploy` for first-time/absent functions and `update`
otherwise) — locate the workflow step by the step name "Update Zappa deployment"
(working-directory: backend) and replace the command accordingly.
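One way to implement that conditional check, sketched with boto3 — the function name passed in is a placeholder, since Zappa derives the real Lambda name from the project and stage settings:

import subprocess

import boto3


def deploy_or_update(stage: str, function_name: str) -> None:
    """Run `zappa deploy` if the Lambda doesn't exist yet, otherwise `zappa update`."""
    client = boto3.client("lambda")
    try:
        client.get_function(FunctionName=function_name)
        command = "update"
    except client.exceptions.ResourceNotFoundException:
        command = "deploy"
    subprocess.run(["poetry", "run", "zappa", command, stage], check=True)


# Example (placeholder function name):
# deploy_or_update("staging", "nest-staging")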
| resource "aws_dynamodb_table" "state_lock" { | ||
| name = "${var.project_name}-terraform-state-lock" | ||
| billing_mode = "PAY_PER_REQUEST" | ||
| hash_key = "LockID" | ||
| tags = merge(local.common_tags, { | ||
| Name = "${var.project_name}-terraform-state-lock" | ||
| }) | ||
|
|
||
| attribute { | ||
| name = "LockID" | ||
| type = "S" | ||
| } | ||
| lifecycle { | ||
| prevent_destroy = true | ||
| } | ||
| point_in_time_recovery { | ||
| enabled = true | ||
| } | ||
| server_side_encryption { | ||
| enabled = true | ||
| kms_key_arn = module.kms.key_arn | ||
| } | ||
| } |
🧩 Analysis chain
🏁 Script executed:
# Locate the KMS module referenced in the DynamoDB config
find . -type f -name "*.tf" | grep -i kms | head -20Repository: OWASP/Nest
Length of output: 172
🏁 Script executed:
# Find the modules directory structure
find . -type d -name "modules" -o -type d -name "kms"Repository: OWASP/Nest
Length of output: 352
🏁 Script executed:
# Look for the KMS module more broadly
fd '\.tf$' --path '*kms*'Repository: OWASP/Nest
Length of output: 269
🏁 Script executed:
# Read the KMS module main.tf to check the key policy
cat -n ./infrastructure/modules/kms/main.tfRepository: OWASP/Nest
Length of output: 2419
🏁 Script executed:
# Also check if there are any other files in the KMS module
ls -la ./infrastructure/modules/kms/Repository: OWASP/Nest
Length of output: 475
Add deletion_protection_enabled = true and fix the KMS key policy for DynamoDB encryption.
prevent_destroy only guards against terraform destroy; it does not prevent deletion via the AWS console or API. For a critical resource like the state lock table, setting deletion_protection_enabled = true adds AWS-level protection against accidental deletion.
More critically, the KMS module's key policy is missing the required statement to allow DynamoDB to use the key for encryption. The key policy currently grants permissions only to the root account and CloudWatch Logs, but the DynamoDB table specifies kms_key_arn = module.kms.key_arn without corresponding KMS grants. Add a statement to the KMS key policy in infrastructure/modules/kms/main.tf that allows the DynamoDB service to use the key with the required actions (kms:Encrypt, kms:Decrypt, kms:DescribeKey, kms:GenerateDataKey*, kms:ReEncrypt*, kms:CreateGrant) and a kms:ViaService condition set to dynamodb.*.amazonaws.com. Without this, encryption will fail at runtime.
🛡️ Proposed DynamoDB table fix
resource "aws_dynamodb_table" "state_lock" {
name = "${var.project_name}-terraform-state-lock"
billing_mode = "PAY_PER_REQUEST"
+ deletion_protection_enabled = true
hash_key = "LockID"🤖 Prompt for AI Agents
In `@infrastructure/backend/main.tf` around lines 73 - 95, The DynamoDB state_lock
resource (aws_dynamodb_table.state_lock) needs AWS-level deletion protection and
the KMS key must allow DynamoDB to use it: add deletion_protection_enabled =
true to the aws_dynamodb_table.state_lock resource, and update the KMS module
key policy (in infrastructure/modules/kms main policy referenced by
module.kms.key_arn) to include a statement granting DynamoDB the actions
kms:Encrypt, kms:Decrypt, kms:DescribeKey, kms:GenerateDataKey*, kms:ReEncrypt*,
kms:CreateGrant with a Condition kms:ViaService set to
dynamodb.<region>.amazonaws.com (or wildcard for region) so DynamoDB can use the
key for server_side_encryption; ensure the principal and resource fields match
the key policy structure and include the account root as needed.
| resource "aws_s3_bucket_lifecycle_configuration" "logs" { | ||
| bucket = aws_s3_bucket.logs.id | ||
|
|
||
| rule { | ||
| id = "expire-logs" | ||
| status = "Enabled" | ||
|
|
||
| abort_incomplete_multipart_upload { | ||
| days_after_initiation = var.abort_incomplete_multipart_upload_days | ||
| } | ||
| expiration { | ||
| days = var.expire_log_days | ||
| } | ||
| } | ||
| } |
Logs bucket: noncurrent versions will accumulate indefinitely.
Versioning is enabled on the logs bucket (Line 225), but the logs lifecycle configuration (Lines 136–150) only defines an expiration rule for current objects. With versioning enabled, expiration merely adds a delete marker — noncurrent versions are never removed. Add a noncurrent_version_expiration rule to prevent unbounded storage growth.
♻️ Proposed fix
resource "aws_s3_bucket_lifecycle_configuration" "logs" {
bucket = aws_s3_bucket.logs.id
rule {
id = "expire-logs"
status = "Enabled"
abort_incomplete_multipart_upload {
days_after_initiation = var.abort_incomplete_multipart_upload_days
}
expiration {
days = var.expire_log_days
}
+ noncurrent_version_expiration {
+ noncurrent_days = var.expire_log_days
+ }
}
 }

Alternatively, if retaining old log versions is not needed, consider whether versioning on the logs bucket is necessary at all — the state bucket is where versioning provides critical value.
Also applies to: 222-227
🤖 Prompt for AI Agents
In `@infrastructure/backend/main.tf` around lines 136 - 150, The lifecycle for the
S3 logs bucket (resource aws_s3_bucket_lifecycle_configuration "logs" targeting
aws_s3_bucket.logs) currently only uses expiration for current objects, so
noncurrent object versions will never be removed; update the lifecycle to add a
noncurrent_version_expiration block (e.g., days = var.noncurrent_expire_days or
a new variable) and/or noncurrent_version_transition rules as appropriate to
ensure older versions are purged, or else remove/disable versioning on
aws_s3_bucket.logs if retaining versions is not required; ensure you reference
the existing abort_incomplete_multipart_upload, expiration and variable names
(var.expire_log_days) when adding the noncurrent rules so the configuration is
consistent.
1. **Upload Fixture to S3**:

   - Upload the fixture present in `backend/data` to `nest-fixtures` bucket using the following command:

     ```bash
     aws s3 cp data/nest.json.gz s3://owasp-nest-fixtures-<id>/
     ```
Clarify the <id> placeholder in the S3 bucket name.
The bucket name includes an <id> placeholder that isn't explained. Unlike the region and account ID placeholders elsewhere, there's no NOTE clarifying what this represents (e.g., random suffix from Terraform output, AWS account ID, etc.).
💡 Suggested clarification
Add a NOTE block before or after the command:
> [!NOTE]
> Replace `<id>` with the unique identifier from the Terraform output `fixtures_bucket_name`.

🤖 Prompt for AI Agents
In `@infrastructure/README.md` around lines 245 - 251, The README's S3 upload step
uses an unexplained placeholder `owasp-nest-fixtures-<id>`; add a short NOTE
near that command clarifying that `<id>` is the unique bucket suffix from the
Terraform output `fixtures_bucket_name` (or other source) and instruct the user
to replace it with that exact value; reference the command that mentions
`data/nest.json.gz` and the bucket name `owasp-nest-fixtures-<id>` so reviewers
can find where to insert the NOTE.
* Run make update * Clean up snapshot generated videos * Update backend/data/nest.dump * feat(ui): revamp corporate supporters carousel (Infinite Marquee + Dark Mode fix) (#3837) * feat(ui): revamp corporate supporters carousel (Infinite Marquee + Dark Mode fix) * fix: resolve failing test case * fix: add fallback text for unnamed sponsors * docs: add docstrings to satisfy coverage requirements * Run make check and fix tests. --------- Co-authored-by: Kate <kate@kgthreads.com> * Fix/redundant typescript assertion (#3834) * Fix Sonar S4325 by narrowing session user fields instead of casting * Fix unused ExtendedSession in mentorship page * fix: redundant-typescript-assertion * Fix stale latest date displayed in Project Health Dashboard metrics (#3842) * Fixed latest date in proejct health dashboard * updated order * Update code * Update code --------- Co-authored-by: Arkadii Yakovets <arkadii.yakovets@owasp.org> * feat: improve backend test coverage to 96% (#3840) * feat: improve backend test coverage to 96% * fix comments * fix issues * fix issue * fix cubic-dev-ai comments * Update code * Fix tests --------- Co-authored-by: Arkadii Yakovets <arkadii.yakovets@owasp.org> Co-authored-by: Arkadii Yakovets <2201626+arkid15r@users.noreply.github.com> * Fix: merge consecutive RUN instructions in frontend Dockerfile (#3644) * Fix: merge consecutive RUN instructions in frontend Dockerfile * fix: comment Dockerfile note to prevent syntax error * Update code * Update code --------- Co-authored-by: Arkadii Yakovets <arkadii.yakovets@owasp.org> Co-authored-by: Arkadii Yakovets <2201626+arkid15r@users.noreply.github.com> * Fix 'is_merged' not being available on the Issue (#3843) * Fix 'is_merged' not being available on the Issue * Update code --------- Co-authored-by: Arkadii Yakovets <arkadii.yakovets@owasp.org> * CI: Add ansible-lint workflow for Ansible playbooks (#3796) * ci: add ansible-lint workflow Signed-off-by: Muhammad Hassaan Saleem <iamhassaans@gmail.com> * Update .github/workflows/lint-ansible.yaml Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> * ci: add ansible-lint make target and workflow Signed-off-by: Muhammad Hassaan Saleem <iamhassaans@gmail.com> * ci: add ansible-lint pre-commit hook Signed-off-by: Muhammad Hassaan Saleem <iamhassaans@gmail.com> * fix: whitespace & version Signed-off-by: Muhammad Hassaan Saleem <iamhassaans@gmail.com> * Update Makefile Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com> * ci: enable ansible-lint scanning and add requirements.yml Signed-off-by: Muhammad Hassaan Saleem <iamhassaans@gmail.com> * chore(ansible):align linting and module usage Signed-off-by: Muhammad Hassaan Saleem <iamhassaans@gmail.com> * ci(ansible): install collections before deploy playbooks Signed-off-by: Muhammad Hassaan Saleem <iamhassaans@gmail.com> * Update code * Update code * Update .github/workflows/run-ci-cd.yaml --------- Signed-off-by: Muhammad Hassaan Saleem <iamhassaans@gmail.com> Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com> Co-authored-by: Arkadii Yakovets <arkadii.yakovets@owasp.org> * Fix ElevenLabs API error (#3861) * use default liam voice * bump speed by 0.10 --------- Co-authored-by: Arkadii Yakovets 
<2201626+arkid15r@users.noreply.github.com> * Add Ime Iyonsi to MENTORS.md (#3866) * Add mentor profile for Ime Iyonsi Added Ime Iyonsi's mentor profile. * Fix GitHub link for Ime Iyonsi Corrected GitHub link for Ime Iyonsi. * Update code --------- Co-authored-by: Arkadii Yakovets <arkadii.yakovets@owasp.org> * Update MENTORS.md * Enabled Strict Mode (#3776) * Enabled Strict Mode * fixed ai review * fix * fixed review * fix * update test * Update code --------- Co-authored-by: Arkadii Yakovets <arkadii.yakovets@owasp.org> Co-authored-by: Arkadii Yakovets <2201626+arkid15r@users.noreply.github.com> * Resolve case-sensitivity in QueryParser to support Chapters/Members search (#3844) * resolve query parser blocker * use case_sensitive flag in QueryParser * feat: add case_sensitive option to QueryParser and update tests * Update code --------- Co-authored-by: Arkadii Yakovets <arkadii.yakovets@owasp.org> Co-authored-by: Arkadii Yakovets <2201626+arkid15r@users.noreply.github.com> * Update dependencies (#3874) * Update dependencies * Bump django-ninja version * fix(proxy): pin nginx and certbot images (#3848) * fix(proxy): pin nginx and certbot images Signed-off-by: Muhammad Hassaan Saleem <iamhassaans@gmail.com> * fix stable verssions Signed-off-by: Muhammad Hassaan Saleem <iamhassaans@gmail.com> --------- Signed-off-by: Muhammad Hassaan Saleem <iamhassaans@gmail.com> Co-authored-by: Arkadii Yakovets <2201626+arkid15r@users.noreply.github.com> * Update docker-compose/proxy/compose.yaml * Update backend/pyproject.toml * Update ansible lint configuration (#3880) * Update .github/ansible/.ansible-lint.yaml * Improve frontend test coverage above 80% and add missing test files (#3864) * Imrove test coverage to 80% and added test * Fixed coderabbit review * update code * fixed coderabbit ai * fixed soanrqube warning * fixed review * update * fixed aloglia cache_key (#3825) * fixed aloglia cache_key * change separator val to be semicolon (;) * Update code * add tests + use json filters * add trailing newline * Update code --------- Co-authored-by: Arkadii Yakovets <arkadii.yakovets@owasp.org> Co-authored-by: Arkadii Yakovets <2201626+arkid15r@users.noreply.github.com> * fix: remove unused className prop from AnchorTitle component (#3822) * fix: remove unused className prop from AnchorTitle component Fixes #3805 The className prop was defined in AnchorTitleProps but never used in the component implementation. Removing it resolves Sonar rule typescript:S6767 and improves code maintainability. 
* fix: use className prop instead of removing it - Added className back to AnchorTitleProps interface - Accept className parameter in component - Apply className to root div element - Resolves reviewer feedback on PR #3822 * Update code --------- Co-authored-by: Arkadii Yakovets <arkadii.yakovets@owasp.org> Co-authored-by: Arkadii Yakovets <2201626+arkid15r@users.noreply.github.com> --------- Signed-off-by: Muhammad Hassaan Saleem <iamhassaans@gmail.com> Co-authored-by: Arkadii Yakovets <arkadii.yakovets@owasp.org> Co-authored-by: Yashraj Pahuja <yashrajpahuja9999@gmail.com> Co-authored-by: Kate <kate@kgthreads.com> Co-authored-by: CodeAritraDhank <aritradhank21@gmail.com> Co-authored-by: Anurag Yadav <143180737+anurag2787@users.noreply.github.com> Co-authored-by: Harshit Verma <harshit1092004@gmail.com> Co-authored-by: Arkadii Yakovets <2201626+arkid15r@users.noreply.github.com> Co-authored-by: Shuban Mutagi <shubanmutagi55@gmail.com> Co-authored-by: Muhammad Hassaan Saleem <iamhassaans@gmail.com> Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com> Co-authored-by: emaybu <152900874+emaybu@users.noreply.github.com> Co-authored-by: sai chethana <saichethanavesireddy@gmail.com> Co-authored-by: Rahul Paul <179798584+Mr-Rahul-Paul@users.noreply.github.com> Co-authored-by: Lavanya <lavanyayadawad30@gmail.com>
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In @.github/workflows/run-ci-cd.yaml:
- Around line 420-437: The run-infrastructure-tests job currently lacks a
timeout-minutes setting which can allow Terraform to hang and waste runner
minutes; add a timeout-minutes (e.g., 15) to the run-infrastructure-tests job
definition so the entire job will be killed after that duration—update the job
block named "run-infrastructure-tests" (the job with name: Run Infrastructure
tests and the step that runs "make test-infrastructure") to include
timeout-minutes: 15.
In `@backend/pyproject.toml`:
- Line 14: The dependency entry dependencies.aws-xray-sdk = "^2.15.0" is fine
for now but AWS X-Ray SDKs enter maintenance mode on 2026-02-25; update the repo
by (A) keeping the current version pin as-is (dependencies.aws-xray-sdk =
"^2.15.0") to avoid breaking changes, (B) add a short note in the project
docs/CHANGELOG mentioning the maintenance-mode date and impact, and (C) create a
tracking issue/task to evaluate migrating to OpenTelemetry (include migration
criteria and target date before 2026-02-25) so the team can plan a replacement
before end-of-support.
🧹 Nitpick comments (2)
.github/workflows/run-ci-cd.yaml (2)
808-816: Guard against empty task ARN from `run-task`. If ECS fails to place the task (e.g., capacity issues), `tasks[0].taskArn` resolves to `None` (literal string). The subsequent `wait` and `describe-tasks` steps would then fail with a confusing error. Add a guard:
Proposed fix
 TASK_ARN=$(aws ecs run-task \
   --cluster "$CLUSTER_NAME" \
   --task-definition "$TASK_DEFINITION" \
   --launch-type FARGATE \
   --network-configuration "awsvpcConfiguration={subnets=[$PRIVATE_SUBNET_IDS],securityGroups=[$ECS_SECURITY_GROUP_ID],assignPublicIp=DISABLED}" \
   --query 'tasks[0].taskArn' \
   --output text)
+
+ if [ -z "$TASK_ARN" ] || [ "$TASK_ARN" = "None" ]; then
+   echo "::error::Failed to start migrate task. Check ECS capacity and task definition."
+   exit 1
+ fi
 echo "task_arn=$TASK_ARN" >> $GITHUB_OUTPUT
499-501: Consider adding a version-specific ECR tag alongside `latest`. Both backend and frontend ECR images are tagged only as `:latest`. Adding a version tag (e.g., using `RELEASE_VERSION`) would enable rollbacks and make it easier to trace which commit is deployed. This is optional for staging but becomes important if this pattern is replicated for production.
run-infrastructure-tests:
  name: Run Infrastructure tests
  needs:
    - run-security-scan-code
  permissions:
    contents: read
  runs-on: ubuntu-latest
  steps:
    - name: Check out repository
      uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd

    - name: Install Terraform
      uses: hashicorp/setup-terraform@b9cd54a3c349d3f38e8881555d616ced269862dd
      with:
        terraform_version: 1.14.0

    - name: Run Infrastructure tests
      run: make test-infrastructure
Missing timeout-minutes on run-infrastructure-tests.
Other test jobs in this workflow consistently set timeout-minutes. Without one, the default 360-minute limit applies, which could waste runner minutes if Terraform hangs. Consider adding a reasonable timeout (e.g., 10–15 minutes).
Proposed fix
- name: Run Infrastructure tests
run: make test-infrastructure
+  timeout-minutes: 15

🤖 Prompt for AI Agents
In @.github/workflows/run-ci-cd.yaml around lines 420 - 437, The
run-infrastructure-tests job currently lacks a timeout-minutes setting which can
allow Terraform to hang and waste runner minutes; add a timeout-minutes (e.g.,
15) to the run-infrastructure-tests job definition so the entire job will be
killed after that duration—update the job block named "run-infrastructure-tests"
(the job with name: Run Infrastructure tests and the step that runs "make
test-infrastructure") to include timeout-minutes: 15.
packages = [ { include = "apps" } ]
dependencies.algoliasearch = "^4.13.2"
dependencies.algoliasearch-django = "^4.0.0"
dependencies.aws-xray-sdk = "^2.15.0"
🧩 Analysis chain
🌐 Web query:
aws-xray-sdk PyPI latest version 2026
💡 Result:
As of February 10, 2026, the latest PyPI release of aws-xray-sdk is 2.15.0, published Oct 29, 2025. [1]
Install/update:
pip install -U aws-xray-sdk

Sources: [1]
🌐 Web query:
aws-xray-sdk security vulnerabilities CVE 2025 2026
💡 Result:
Findings (CVE 2025/2026 for “aws-xray-sdk”)
- No CVE-2025 or CVE-2026 entries found for the official npm package `aws-xray-sdk` in Snyk’s database (it reports no direct vulnerabilities for the latest release). [1]
- No CVE-2025 or CVE-2026 entries found for the official PyPI package `aws-xray-sdk` in Safety’s database (it reports no known security vulnerabilities for latest version 2.15.0, released Oct 29, 2025). [2]
Important related 2025 issue: malicious/typosquat packages (not CVEs)
- OSV flags malicious npm packages with confusingly similar names, e.g. `aws-xray-sdk-node` (OSV MAL-2025-15232), as containing malicious code. [3]
- Another example: `test-aws-xray-sdk-express` (OSV MAL-2025-34766) is also flagged as malicious. [4]
Lifecycle note (affects patch cadence)
AWS announced the AWS X-Ray SDKs/Daemon enter maintenance mode on Feb 25, 2026, with releases limited to critical bug fixes and security updates, and end-of-support on Feb 25, 2027. [5]
What to do
- Ensure you’re using the official package names (`aws-xray-sdk` on PyPI / `aws-xray-sdk` on npm), and block/alert on lookalike packages (like `aws-xray-sdk-node`). [3][4]
- Keep dependency scanning enabled (SCA), because “no direct CVE” doesn’t rule out indirect dependency issues. [1][2]
- Plan migration to OpenTelemetry given the maintenance-mode timeline. [5]
Sources: Snyk package page [1], Safety CLI DB [2], OSV malicious package entries [3][4], AWS announcement [5].
aws-xray-sdk 2.15.0 is the current latest release (Oct 29, 2025) with no known security vulnerabilities, but AWS X-Ray SDKs enter maintenance mode on Feb 25, 2026.
Version ^2.15.0 is valid and secure. However, note that AWS announced the X-Ray SDKs enter maintenance mode on Feb 25, 2026 (11 days away), limiting releases to critical bug fixes and security updates only, with end-of-support on Feb 25, 2027. Consider evaluating OpenTelemetry as a long-term alternative given this timeline.
🤖 Prompt for AI Agents
In `@backend/pyproject.toml` at line 14, the dependency entry
dependencies.aws-xray-sdk = "^2.15.0" is fine for now but AWS X-Ray SDKs enter
maintenance mode on 2026-02-25; update the repo by (A) keeping the current
version pin as-is (dependencies.aws-xray-sdk = "^2.15.0") to avoid breaking
changes, (B) add a short note in the project docs/CHANGELOG mentioning the
maintenance-mode date and impact, and (C) create a tracking issue/task to
evaluate migrating to OpenTelemetry (include migration criteria and target date
before 2026-02-25) so the team can plan a replacement before end-of-support.
…-migration-sync Sync feature/nest-zappa-migration with main
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In @.github/workflows/run-ci-cd.yaml:
- Around line 861-873: The ECS "index-data" run-task step is fire-and-forget and
lacks failure detection; modify the Run ECS index-data task step
(TASK_DEFINITION: nest-staging-index-data) to capture the started task ARN from
the aws ecs run-task JSON output, log that ARN (and CLUSTER_NAME/POLICY
context), then poll or use aws ecs wait/describe-tasks against that task ARN to
wait for completion and inspect the containers[].exitCode and lastStatus,
failing the job (non-zero exit) or printing detailed failure info if the task
fails; mirror the migrate task's pattern for capturing the ARN, waiting, and
checking exit codes.
🧹 Nitpick comments (3)
.github/workflows/run-ci-cd.yaml (3)
473-478: Consider OIDC-based AWS authentication instead of static access keys. Three jobs (`build-staging-images`, `plan-staging-nest`, `deploy-staging-nest`) use `aws-access-key-id`/`aws-secret-access-key` for authentication. For a public repository, using GitHub OIDC with `role-to-assume` is more secure: it eliminates long-lived credentials and reduces blast radius if secrets are compromised. This would require adding `id-token: write` to the job permissions and configuring an IAM OIDC identity provider in AWS.

Also applies to: 640-645, 737-742
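For illustration only, a minimal sketch of the OIDC variant (the role ARN secret name and region are placeholders, and the action reference would normally be pinned to a commit SHA like the other actions in this workflow):

```yaml
plan-staging-nest:
  permissions:
    contents: read
    id-token: write  # lets the job request a GitHub OIDC token
  runs-on: ubuntu-latest
  steps:
    - name: Configure AWS credentials (OIDC)
      uses: aws-actions/configure-aws-credentials@v4
      with:
        role-to-assume: ${{ secrets.AWS_STAGING_DEPLOY_ROLE_ARN }}  # hypothetical secret name
        aws-region: us-east-1  # placeholder region
```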
652-663: Terraform backend configuration is duplicated between plan and deploy jobs. The `Prepare terraform backend` step is identical in `plan-staging-nest` (lines 652–663) and `deploy-staging-nest` (lines 749–760). Consider extracting it into a composite action to reduce duplication and maintenance burden.

Also applies to: 749-760
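A hedged sketch of such a composite action (hypothetical path `.github/actions/prepare-terraform-backend/action.yml`; the inputs and the backend file path are illustrative, not the repository's actual values):

```yaml
name: Prepare terraform backend
description: Shared backend preparation for plan-staging-nest and deploy-staging-nest.
inputs:
  state-bucket:
    description: S3 bucket that stores the Terraform state.
    required: true
  state-key:
    description: Object key of the staging state file.
    required: true
runs:
  using: composite
  steps:
    - name: Write backend configuration
      shell: bash
      run: |
        # Placeholder path; point this at wherever terraform init is run.
        cat > infrastructure/staging/backend.hcl <<EOF
        bucket = "${{ inputs.state-bucket }}"
        key    = "${{ inputs.state-key }}"
        EOF
```

Both jobs could then replace the duplicated step with `uses: ./.github/actions/prepare-terraform-backend` plus the two inputs.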
58-58: Terraform 1.14.0 is valid but not the latest patch. Terraform 1.14.5 is available as of now. Consider updating to the latest patch release for bug fixes, or pin to `~1.14` to automatically pick up patches.

Also applies to: 434-434, 650-650, 747-747
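If a constraint pin is preferred, it could look like the sketch below (this assumes `setup-terraform` accepts a semver constraint for `terraform_version` rather than only an exact version):

```yaml
- name: Install Terraform
  uses: hashicorp/setup-terraform@b9cd54a3c349d3f38e8881555d616ced269862dd
  with:
    terraform_version: "~1.14"  # any 1.14.x patch release
```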
```yaml
- name: Run ECS index-data task
  env:
    CLUSTER_NAME: ${{ steps.tf-outputs.outputs.tasks_cluster_name }}
    ECS_SECURITY_GROUP_ID: ${{ steps.tf-outputs.outputs.ecs_security_group_id }}
    PRIVATE_SUBNET_IDS: ${{ steps.tf-outputs.outputs.private_subnet_ids }}
    TASK_DEFINITION: nest-staging-index-data
  run: |
    aws ecs run-task \
      --cluster "$CLUSTER_NAME" \
      --task-definition "$TASK_DEFINITION" \
      --launch-type FARGATE \
      --network-configuration "awsvpcConfiguration={subnets=[$PRIVATE_SUBNET_IDS],securityGroups=[$ECS_SECURITY_GROUP_ID],assignPublicIp=DISABLED}"
    echo "Index-data task started (runs async, ~30 min)"
```
Fire-and-forget index-data task has no failure detection.
Unlike the migrate task (Lines 837–858), the index-data ECS task is started without capturing its task ARN, waiting for completion, or checking the exit code. A silent failure here means stale search indexes with no CI signal. Consider at minimum capturing the task ARN and logging it for observability, or adding a non-blocking check step.
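A rough sketch of mirroring the migrate task's pattern (the polling interval and error messages are illustrative; the step reuses the same `tf-outputs` values as the current version):

```yaml
- name: Run ECS index-data task
  env:
    CLUSTER_NAME: ${{ steps.tf-outputs.outputs.tasks_cluster_name }}
    ECS_SECURITY_GROUP_ID: ${{ steps.tf-outputs.outputs.ecs_security_group_id }}
    PRIVATE_SUBNET_IDS: ${{ steps.tf-outputs.outputs.private_subnet_ids }}
    TASK_DEFINITION: nest-staging-index-data
  run: |
    set -e
    TASK_ARN=$(aws ecs run-task \
      --cluster "$CLUSTER_NAME" \
      --task-definition "$TASK_DEFINITION" \
      --launch-type FARGATE \
      --network-configuration "awsvpcConfiguration={subnets=[$PRIVATE_SUBNET_IDS],securityGroups=[$ECS_SECURITY_GROUP_ID],assignPublicIp=DISABLED}" \
      --query 'tasks[0].taskArn' \
      --output text)
    if [ -z "$TASK_ARN" ] || [ "$TASK_ARN" = "None" ]; then
      echo "::error::Failed to start index-data task."
      exit 1
    fi
    echo "Started index-data task: $TASK_ARN"
    # Poll manually: 'aws ecs wait tasks-stopped' gives up after ~10 minutes,
    # which is too short for this ~30 minute task.
    while true; do
      STATUS=$(aws ecs describe-tasks --cluster "$CLUSTER_NAME" --tasks "$TASK_ARN" \
        --query 'tasks[0].lastStatus' --output text)
      [ "$STATUS" = "STOPPED" ] && break
      sleep 60
    done
    EXIT_CODE=$(aws ecs describe-tasks --cluster "$CLUSTER_NAME" --tasks "$TASK_ARN" \
      --query 'tasks[0].containers[0].exitCode' --output text)
    if [ "$EXIT_CODE" != "0" ]; then
      echo "::error::index-data task exited with code $EXIT_CODE"
      exit 1
    fi
```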
🤖 Prompt for AI Agents
In @.github/workflows/run-ci-cd.yaml around lines 861 - 873, The ECS
"index-data" run-task step is fire-and-forget and lacks failure detection;
modify the Run ECS index-data task step (TASK_DEFINITION:
nest-staging-index-data) to capture the started task ARN from the aws ecs
run-task JSON output, log that ARN (and CLUSTER_NAME/POLICY context), then poll
or use aws ecs wait/describe-tasks against that task ARN to wait for completion
and inspect the containers[].exitCode and lastStatus, failing the job (non-zero
exit) or printing detailed failure info if the task fails; mirror the migrate
task's pattern for capturing the ARN, waiting, and checking exit codes.



Proposed change
Resolves #3826
Merge feature/nest-zappa-migration to main
Checklist
`make check-test` locally: all warnings addressed, tests passed